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PREFACE 


This book is intended for an introductory course in digital logic design, which is a basic 
course in most electrical and computer engineering programs. A successful designer of 
digital logic circuits needs a good understanding of basic concepts and a firm grasp of the 
modern design approach that relies on computer-aided design (CAD) tools. 

The main goals of the book are (1) to teach students the fundamental concepts in 
classical manual digital design and (2) illustrate clearly the way in which digital circuits 
are designed today, using CAD tools. Even though modern designers no longer use manual 
techniques, except in rare circumstances, our motivation for teaching such techniques is 
to give students an intuitive feeling for how digital circuits operate. Also, the manual 
techniques provide an illustration of the types of manipulations performed by CAD tools, 
giving students an appreciation of the benefits provided by design automation. Throughout 
the book, basic concepts are introduced by way of examples that involve simple circuit 
designs, which we perform using both manual techniques and modern CAD-tool-based 
methods. Having established the basic concepts, more complex examples are then provided, 
using the CAD tools. Thus our emphasis is on modern design methodology to illustrate 
how digital design is carried out in practice today. 


TECHNOLOGY 


The book discusses modern digital circuit implementation technologies. The emphasis is on 
programmable logic devices (PLDs), which is the most appropriate technology for use in a 
textbook for two reasons. First, PLDs are widely used in practice and are suitable for almost 
all types of digital circuit designs. In fact, students are more likely to be involved in PLD- 
based designs at some point in their careers than in any other technology. Second, circuits 
are implemented in PLDs by end-user programming. Therefore, students can be provided 
with an opportunity, in a laboratory setting, to implement the book’s design examples in 
actual chips. Students can also simulate the behavior of their designed circuits on their own 
computers. We use the two most popular types of PLDs for targeting of designs: complex 
programmable logic devices (CPLDs) and field-programmable gate arrays (FPGAs). 

We emphasize the use of a hardware description language in specifying the logic cir- 
cuits, because the HDL-based approach is the most efficient design method to use in practice. 
We describe in detail the IEEE Standard Verilog HDL language and use it extensively in 
examples. 


vi 


PREFACE 


SCOPE OF THE BOOK 


This edition of the book has been extensively restructured. All of the material that should 
be covered in a one-semester course is now included in Chapters | to 6. More advanced 
material is presented in Chapters 7 to 11. 

Chapter | provides a general introduction to the process of designing digital systems. 
It discusses the key steps in the design process and explains how CAD tools can be used 
to automate many of the required tasks. It also introduces the representation of digital 
information. 

Chapter 2 introduces the logic circuits. It shows how Boolean algebra is used to 
represent such circuits. It introduces the concepts of logic circuit synthesis and optimization, 
and shows how logic gates are used to implement simple circuits. It also gives the reader 
a first glimpse at Verilog, as an example of a hardware description language that may be 
used to specify the logic circuits. 

Chapter 3 concentrates on circuits that perform arithmetic operations. It discusses num- 
bers and shows how they can be manipulated using logic circuits. This chapter illustrates 
how Verilog can be used to specify the desired functionality and how CAD tools provide a 
mechanism for developing the required circuits. 

Chapter 4 presents combinational circuits that are used as building blocks. It includes 
the encoder, decoder, and multiplexer circuits. These circuits are very convenient for 
illustrating the application of many Verilog constructs, giving the reader an opportunity to 
discover more advanced features of Verilog. 

Storage elements are introduced in Chapter 5. The use of flip-flops to realize regular 
structures, such as shift registers and counters, is discussed. Verilog-specified designs of 
these structures are included. 

Chapter 6 gives a detailed presentation of synchronous sequential circuits (finite state 
machines). It explains the behavior of these circuits and develops practical design tech- 
niques for both manual and automated design. 

Chapter 7 is a discussion of a number of practical issues that arise in the design of real 
systems. It highlights problems often encountered in practice and indicates how they can 
be overcome. Examples of larger circuits illustrate a hierarchical approach in designing 
digital systems. Complete Verilog code for these circuits is presented. 

Chapter 8 deals with more advanced techniques for optimized implementation of logic 
functions. It presents algorithmic techniques for optimization. It also explains how logic 
functions can be specified using a cubical representation as well as using binary decision 
diagrams. 

Asynchronous sequential circuits are discussed in Chapter 9. While this treatment is 
not exhaustive, it provides a good indication of the main characteristics of such circuits. 
Even though the asynchronous circuits are not used extensively in practice, they provide 
an excellent vehicle for gaining a deeper understanding of the operation of digital circuits 
in general. They illustrate the consequences of propagation delays and race conditions that 
may be inherent in the structure of a circuit. 

Chapter 10 presents a complete CAD flow that the designer experiences when design- 
ing, implementing, and testing a digital circuit. 
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PREFACE 


Chapter 11 introduces the topic of testing. A designer of logic circuits has to be aware 
of the need to test circuits and should be conversant with at least the most basic aspects of 
testing. 

Appendix A provides a complete summary of Verilog features. Although use of Verilog 
is integrated throughout the book, this appendix provides a convenient reference that the 
reader can consult from time to time when writing Verilog code. 

The electronic aspects of digital circuits are presented in Appendix B. This appendix 
shows how the basic gates are built using transistors and presents various factors that affect 
circuit performance. The emphasis is on the latest technologies, with particular focus on 
CMOS technology and programmable logic devices. 


WHat CAN BE COVERED IN A COURSE 


Much of the material in the book can be covered in 2 one-quarter courses. A good coverage 
of the most important material can be achieved in a single one-semester, or even a one- 
quarter course. This is possible only if the instructor does not spend too much time teaching 
the intricacies of Verilog and CAD tools. To make this approach possible, we organized 
the Verilog material in a modular style that is conducive to self-study. Our experience in 
teaching different classes of students at the University of Toronto shows that the instructor 
may spend only three to four lecture hours on Verilog, describing how the code should be 
structured, including the use of design hierarchy, using scalar and vector variables, and on 
the style of code needed to specify sequential circuits. The Verilog examples given in the 
book are largely self-explanatory, and students can understand them easily. 

The book is also suitable for a course in logic design that does not include exposure to 
Verilog. However, some knowledge of Verilog, even at a rudimentary level, is beneficial 
to the students, and it is a great preparation for a job as a design engineer. 


One-Semester Course 


The following material should be covered in lectures: 


e Chapter 1—all sections. 
e Chapter 2—all sections. 
e¢ Chapter 3—Sections 3.1 to 3.5. 
e Chapter 4—all sections. 
e Chapter 5—all sections. 
e Chapter 6—all sections. 


One-Quarter Course 


In a one-quarter course the following material can be covered: 


e Chapter 1—all sections. 


e Chapter 2—all sections. 
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e Chapter 3—Sections 3.1 to 3.3 and Section 3.5. 
e Chapter 4—all sections. 

e Chapter 5—all sections. 

e Chapter 6—Sections 6.1 to 6.4. 


VERILOG 


Verilog is acomplex language, which some instructors feel is too hard for beginning students 
to grasp. We fully appreciate this issue and have attempted to solve it. It is not necessary 
to introduce the entire Verilog language. In the book we present the important Verilog 
constructs that are useful for the design and synthesis of logic circuits. Many other language 
constructs, such as those that have meaning only when using the language for simulation 
purposes, are omitted. The Verilog material is introduced gradually, with more advanced 
features being presented only at points where their use can be demonstrated in the design 
of relevant circuits. 

The book includes more than 120 examples of Verilog code. These examples illustrate 
how Verilog is used to describe a wide range of logic circuits, from those that contain only 
a few gates to those that represent digital systems such as a simple processor. 

All of the examples of Verilog code presented in the book are provided on the Authors’ 
website at 


www.eecg.toronto.edu/~brown/Verilog_3e 


SOLVED PROBLEMS 


The chapters include examples of solved problems. They show how typical homework 
problems may be solved. 


HOMEWORK PROBLEMS 


More than 400 homework problems are provided in the book. Answers to selected problems 
are given at the back of the book. Solutions to all problems are available to instructors in 
the Solutions Manual that accompanies the book. 


POWERPOINT SLIDES AND SOLUTIONS MANUAL 


PowerPoint slides that contain all of the figures in the book are available on the Authors’ 
website. Instructors can request access to these slides, as well as access to the Solutions 
Manual for the book, at: 


www.mhhe.com/brownvranesic 
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CAD TooLs 


Modern digital systems are quite large. They contain complex logic circuits that would be 
difficult to design without using good CAD tools. Our treatment of Verilog should enable the 
reader to develop Verilog code that specifies logic circuits of varying degrees of complexity. 
To gain proper appreciation of the design process, it is highly beneficial to implement the 
designs using commercially-available CAD tools. Some excellent CAD tools are available 
free of charge. For example, the Altera Corporation has its Quartus II CAD software, which 
is widely used for implementing designs in programmable logic devices such as FPGAs. 
The Web Edition of the Quartus II software can be downloaded from Altera’s website and 
used free of charge, without the need to obtain a license. In previous editions of this 
book a set of tutorials for using the Quartus II software was provided in the appendices. 
Those tutorials can now be found on the Authors’ website. Another set of useful tutorials 
about Quartus II can be found on Altera’s University Program website, which is located at 
www.altera.com/education/univ. 
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INTRODUCTION 


CHAPTER OBJECTIVES 


In this chapter you will be introduced to: 


e Digital hardware components 

e An overview of the design process 
e Binary numbers 

e Digital representation of information 


2 CHAPTER 1 e INTRODUCTION 


This book is about logic circuits—the circuits from which computers are built. Proper understanding of 
logic circuits is vital for today’s electrical and computer engineers. These circuits are the key ingredient of 
computers and are also used in many other applications. They are found in commonly-used products like 
music and video players, electronic games, digital watches, cameras, televisions, printers, and many household 
appliances, as well as in large systems, such as telephone networks, Internet equipment, television broadcast 
equipment, industrial control units, and medical instruments. In short, logic circuits are an important part of 
almost all modern products. 

The material in this book will introduce the reader to the many issues involved in the design of logic 
circuits. It explains the key ideas with simple examples and shows how complex circuits can be derived 
from elementary ones. We cover the classical theory used in the design of logic circuits because it provides 
the reader with an intuitive understanding of the nature of such circuits. But, throughout the book, we 
also illustrate the modern way of designing logic circuits using sophisticated computer aided design (CAD) 
software tools. The CAD methodology adopted in the book is based on the industry-standard design language 
called the Verilog hardware description language. Design with Verilog is first introduced in Chapter 2, and 
usage of Verilog and CAD tools is an integral part of each chapter in the book. 

Logic circuits are implemented electronically, using transistors on an integrated circuit chip. Commonly 
available chips that use modern technology may contain more than a billion transistors, as in the case of some 
computer processors. The basic building blocks for such circuits are easy to understand, but there is nothing 
simple about a circuit that contains billions of transistors. The complexity that comes with large circuits can 
be handled successfully only by using highly-organized design techniques. We introduce these techniques in 
this chapter, but first we briefly describe the hardware technology used to build logic circuits. 


| 1.1 DIGITAL HARDWARE 


Logic circuits are used to build computer hardware, as well as many other types of products. 
All such products are broadly classified as digital hardware. The reason that the name digital 
is used will be explained in Section 1.5—it derives from the way in which information is 
represented in computers, as electronic signals that correspond to digits of information. 

The technology used to build digital hardware has evolved dramatically over the past 
few decades. Until the 1960s logic circuits were constructed with bulky components, such 
as transistors and resistors that came as individual parts. The advent of integrated circuits 
made it possible to place a number of transistors, and thus an entire circuit, on a single chip. 
In the beginning these circuits had only a few transistors, but as the technology improved 
they became more complex. Integrated circuit chips are manufactured on a silicon wafer, 
such as the one shown in Figure 1.1. The wafer is cut to produce the individual chips, 
which are then placed inside a special type of chip package. By 1970 it was possible to 
implement all circuitry needed to realize a microprocessor on a single chip. Although early 
microprocessors had modest computing capability by today’s standards, they opened the 
door for the information processing revolution by providing the means for implementation 
of affordable personal computers. 
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Figure 1.1 —_A silicon wafer (courtesy of Altera Corp.). 


About 30 years ago Gordon Moore, chairman of Intel Corporation, observed that in- 
tegrated circuit technology was progressing at an astounding rate, approximately doubling 
the number of transistors that could be placed on a chip every two years. This phenomenon, 
informally known as Moore’s law, continues to the present day. Thus in the early 1990s 
microprocessors could be manufactured with a few million transistors, and by the late 1990s 
it became possible to fabricate chips that had tens of millions of transistors. Presently, chips 
can be manufactured containing billions of transistors. 

Moore’s law is expected to continue to hold true for a number of years. A consortium 
of integrated circuit associations produces a forecast of how the technology is expected 
to evolve. Known as the International Technology Roadmap for Semiconductors (ITRS) 
[1], this forecast discusses many aspects of technology, including the maximum number of 
transistors that can be manufactured ona single chip. Asample of data from the ITRS is given 
in Figure 1.2. It shows that chips with about 10 million transistors could be successfully 
manufactured in 1995, and this number has steadily increased, leading to today’s chips with 
over a billion transistors. The roadmap predicts that chips with as many as 100 billion 
transistors will be possible by the year 2022. There is no doubt that this technology will 
have a huge impact on all aspects of people’s lives. 

The designer of digital hardware may be faced with designing logic circuits that can be 
implemented on a single chip or designing circuits that involve a number of chips placed 
on a printed circuit board (PCB). Frequently, some of the logic circuits can be realized 
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Figure 1.2 = Anestimate of the maximum number of transistors per chip 
over time. 


in existing chips that are readily available. This situation simplifies the design task and 
shortens the time needed to develop the final product. Before we discuss the design process 
in detail, we should introduce the different types of integrated circuit chips that may be 
used. 

There exists a large variety of chips that implement various functions that are useful 
in the design of digital hardware. The chips range from simple ones with low function- 
ality to extremely complex chips. For example, a digital hardware product may require a 
microprocessor to perform some arithmetic operations, memory chips to provide storage 
capability, and interface chips that allow easy connection to input and output devices. Such 
chips are available from various vendors. 

For many digital hardware products, it is also necessary to design and build some logic 
circuits from scratch. For implementing these circuits, three main types of chips may be 
used: standard chips, programmable logic devices, and custom chips. These are discussed 
next. 


1.1.1 STANDARD CHIPS 


Numerous chips are available that realize some commonly-used logic circuits. We will 
refer to these as standard chips, because they usually conform to an agreed-upon standard 
in terms of functionality and physical configuration. Each standard chip contains a small 
amount of circuitry (usually involving fewer than 100 transistors) and performs a simple 
function. To build a logic circuit, the designer chooses the chips that perform whatever 
functions are needed and then defines how these chips should be interconnected to realize 
a larger logic circuit. 
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Standard chips were popular for building logic circuits until the early 1980s. However, 
as integrated circuit technology improved, it became inefficient to use valuable space on 
PCBs for chips with low functionality. Another drawback of standard chips is that the 
functionality of each chip is fixed and cannot be changed. 


1.1.2 PROGRAMMABLE LOGIC DEVICES 


In contrast to standard chips that have fixed functionality, it is possible to construct chips 
that contain circuitry which can be configured by the user to implement a wide range of 
different logic circuits. These chips have a very general structure and include a collection 
of programmable switches that allow the internal circuitry in the chip to be configured in 
many different ways. The designer can implement whatever functions are required for a 
particular application by setting the programmable switches as needed. The switches are 
programmed by the end user, rather than when the chip is manufactured. Such chips are 
known as programmable logic devices (PLDs). 

PLDs are available in a wide range of sizes, and can be used to implement very large 
logic circuits. The most commonly-used type of PLD is known as a field-programmable 
gate array (FPGA). The largest FPGAs contain billions of transistors [2, 3], and support the 
implementation of complex digital systems. An FPGA consists of a large number of small 
logic circuit elements, which can be connected together by using programmable switches 
in the FPGA. Because of their high capacity, and their capability to be tailored to meet the 
requirements of a specific application, FPGAs are widely used today. 


1.1.3 CUSTOM-DESIGNED CHIPS 


FPGAs are available as off-the-shelf components that can be purchased from different sup- 
pliers. Because they are programmable, they can be used to implement most logic circuits 
found in digital hardware. However, they also have a drawback in that the programmable 
switches consume valuable chip area and limit the speed of operation of implemented cir- 
cuits. Thus in some cases FPGAs may not meet the desired performance or cost objectives. 
In such situations it is possible to design a chip from scratch; namely, the logic circuitry that 
must be included on the chip is designed first and then the chip is manufactured by a com- 
pany that has the fabrication facilities. This approach is known as custom or semi-custom 
design, and such chips are often called application-specific integrated circuits (ASICs). 

The main advantage of a custom chip is that its design can be optimized for a specific 
task; hence it usually leads to better performance. It is possible to include a larger amount 
of logic circuitry in a custom chip than would be possible in other types of chips. The 
cost of producing such chips is high, but if they are used in a product that is sold in large 
quantities, then the cost per chip, amortized over the total number of chips fabricated, may 
be lower than the total cost of off-the-shelf chips that would be needed to implement the 
same function(s). Moreover, if a single chip can be used instead of multiple chips to achieve 
the same goal, then a smaller area is needed on a PCB that houses the chips in the final 
product. This results in a further reduction in cost. 
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A disadvantage of the custom-design approach is that manufacturing a custom chip 
often takes a considerable amount of time, on the order of months. In contrast, if an FPGA 
can be used instead, then the chips are programmed by the end user and no manufacturing 
delays are involved. 


1.2 THE DESIGN PROCESS 


The availability of computer-based tools has greatly influenced the design process in a wide 
variety of environments. For example, designing an automobile is similar in the general 
approach to designing a furnace or a computer. Certain steps in the development cycle must 
be performed if the final product is to meet the specified objectives. 

The flowchart in Figure 1.3 depicts a typical development process. We assume that 
the process is to develop a product that meets certain expectations. The most obvious 
requirements are that the product must function properly, that it must meet an expected 
level of performance, and that its cost should not exceed a given target. 

The process begins with the definition of product specifications. The essential features 
of the product are identified, and an acceptable method of evaluating the implemented 
features in the final product is established. The specifications must be tight enough to 
ensure that the developed product will meet the general expectations, but should not be 
unnecessarily constraining (that is, the specifications should not prevent design choices 
that may lead to unforeseen advantages). 

From a complete set of specifications, it is necessary to define the general structure of 
an initial design of the product. This step is difficult to automate. It is usually performed by 
a human designer because there is no clear-cut strategy for developing a product’s overall 
structure—it requires considerable design experience and intuition. 

After the general structure is established, CAD tools are used to work out the details. 
Many types of CAD tools are available, ranging from those that help with the design 
of individual parts of the system to those that allow the entire system’s structure to be 
represented in a computer. When the initial design is finished, the results must be verified 
against the original specifications. Traditionally, before the advent of CAD tools, this step 
involved constructing a physical model of the designed product, usually including just the 
key parts. Today it is seldom necessary to build a physical model. CAD tools enable 
designers to simulate the behavior of incredibly complex products, and such simulations 
are used to determine whether the obtained design meets the required specifications. If 
errors are found, then appropriate changes are made and the verification of the new design 
is repeated through simulation. Although some design flaws may escape detection via 
simulation, usually all but the most subtle problems are discovered in this way. 

When the simulation indicates that the design is correct, a complete physical prototype 
of the product is constructed. The prototype is thoroughly tested for conformance with the 
specifications. Any errors revealed in the testing must be fixed. The errors may be minor, 
and often they can be eliminated by making small corrections directly on the prototype of 
the product. In case of large errors, it is necessary to redesign the product and repeat the 
steps explained above. When the prototype passes all the tests, then the product is deemed 
to be successfully designed and it can go into production. 
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Figure 1.3. The development process. 
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1.3. STRUCTURE OF A COMPUTER 


To understand the role that logic circuits play in digital systems, consider the structure of 
a typical computer, as illustrated in Figure 1.4a. The computer case houses a number of 
printed circuit boards (PCBs), a power supply, and (not shown in the figure) storage units, 
like a hard disk and DVD or CD-ROM drives. Each unit is plugged into a main PCB, called 
the motherboard. As indicated on the bottom of the figure, the motherboard holds several 
integrated circuit chips, and it provides slots for connecting other PCBs, such as audio, 
video, and network boards. 

Figure 1.4 illustrates the structure of an integrated circuit chip. The chip comprises 
a number of subcircuits, which are interconnected to build the complete circuit. Examples 
of subcircuits are those that perform arithmetic operations, store data, or control the flow 
of data. Each of these subcircuits is a logic circuit. As shown in the middle of the figure, a 
logic circuit comprises a network of connected logic gates. Each logic gate performs a very 
simple function, and more complex operations are realized by connecting gates together. 
Logic gates are built with transistors, which in turn are implemented by fabricating various 
layers of material on a silicon chip. 

This book is primarily concerned with the center portion of Figure 1.4b—the design 
of logic circuits. We explain how to design circuits that perform important functions, such 
as adding, subtracting, or multiplying numbers, counting, storing data, and controlling the 
processing of information. We show how the behavior of such circuits is specified, how 
the circuits are designed for minimum cost or maximum speed of operation, and how the 
circuits can be tested to ensure correct operation. We also briefly explain how transistors 
operate, and how they are built on silicon chips. 


1.4 LocGic Circuit DESIGN IN THIS BOOK 


In this book we use a modern design approach based on the Verilog hardware description 
language and CAD tools to illustrate many aspects of logic circuit design. We selected 
this technology because it is widely used in industry and because it enables the readers to 
implement their designs in FPGA chips, as discussed below. This technology is particularly 
well-suited for educational purposes because many readers have access to facilities for using 
CAD tools and programming FPGA devices. 

To gain practical experience and a deeper understanding of logic circuits, we advise 
the reader to implement the examples in this book using CAD software. Most of the ma- 
jor vendors of CAD systems provide their software at no cost to university students for 
educational use. Some examples are Altera, Cadence, Mentor Graphics, Synopsys, and 
Xilinx. The CAD systems offered by any of these companies can be used equally well 
with this book. Two CAD systems that are particularly well-suited for use with this book 
are the Quartus II software from Altera and the ISE software from Xilinx. Both of these 
CAD systems support all phases of the design cycle for logic circuits and are powerful 
and easy to use. The reader is encouraged to visit the website for these companies, where 
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Figure 1.4 — A digital hardware system (Part 6). 


the software tools and tutorials that explain their use can be downloaded and installed onto 
any personal computer. 

To facilitate experimentation with logic circuits, some FPGA manufacturers provide 
special PCBs that include one or more FPGA chips and an interface to a personal computer. 
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Figure 1.5 = An FPGA board. 


Once a logic circuit has been designed using the CAD tools, the circuit can be programmed 
into an FPGA on the board. Inputs can then be applied to the FPGA by way of switches 
and other devices, and the generated outputs can be examined. An example of such a board 
is depicted in Figure 1.5. This type of board is an excellent vehicle for learning about 
logic circuits, because it provides a collection of simple input and output devices. Many 
illustrative experiments can be carried out by designing and implementing logic circuits 
using the FPGA chip on the board. 


1.5 DIGITAL REPRESENTATION OF INFORMATION 


In Section 1.1 we mentioned that information is represented in logic circuits as electronic 
signals. Each of these signals can be thought of as representing one digit of information. To 
make the design of logic circuits easier, each digit is allowed to take on only two possible 
values, usually denoted as 0 and 1. These logic values are implemented as voltage levels in 
a circuit; the value 0 is usually represented as 0 V (ground), and the value | is the voltage 
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level of the circuit’s power supply. As we discuss in Appendix B, typical power-supply 
voltages in logic circuits range from 1 V DC to 5 V DC. 

In general, all information in logic circuits is represented as combinations of 0 and 1 
digits. Before beginning our discussion of logic circuits in Chapter 2, it will be helpful to 
examine how numbers, alphanumeric data (text), and other information can be represented 
using the digits 0 and 1. 


1.5.1 BINARY NUMBERS 


In the familiar decimal system, a number consists of digits that have 10 possible values, 
from 0 to 9, and each digit represents a multiple of a power of 10. For example, the number 
8547 represents 8 x 10°+ 5 x 107+ 4 x 10'+7 x 10°. We do not normally write the 
powers of 10 with the number, because they are implied by the positions of the digits. In 
general, a decimal integer is expressed by an n-tuple comprising n decimal digits 


D= dy—1dy—2 7% did 
which represents the value 
VD) =d,-74 x 10"! + dee & 1077 + + 4 dy x 10" + a « 10° 


This is referred to as the positional number representation. 

Because the digits have 10 possible values and each digit is weighted as a power of 
10, we say that decimal numbers are base-10 numbers. Decimal numbers are familiar, 
convenient, and easy to understand. However, since digital circuits represent information 
using only the values 0 and 1, it is not practical to have digits that can assume ten values. 
In these circuits it is more appropriate to use the binary, or base-2, system which has only 
the digits 0 and 1. Each binary digit is called a bit. In the binary number system, the same 
positional number representation is used so that 


B= by-1bp-2 +--+ bi bo 
represents an integer that has the value 


V(B) = bey % Ot Bye Ke hy eg x [1.1] 
n—-1 
= yo di x 2! 
i=0 
For example, the binary number 1101 represents the value 


V=1x2+1x274+0x2!41x 2° 


Because a particular digit pattern has different meanings for different bases, we will indicate 
the base as a subscript when there is potential for confusion. Thus to specify that 1101 is 
a base-2 number, we will write (1101)2. Evaluating the preceding expression for V gives 
V=8+4+1= 13. Hence 


(1101) = (13)10 
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Table 1.1 Numbers in decimal 


and binary. 
Decimal Binary 
representation representation 
00 0000 
01 0001 
02 0010 
03 0011 
04 0100 
05 0101 
06 0110 
07 0111 
08 1000 
09 1001 
10 1010 
11 1011 
12 1100 
13 1101 
14 1110 
15 1111 


The range of integers that can be represented by a binary number depends on the number of 
bits used. Table 1.1 lists the first 15 positive integers and shows their binary representations 
using four bits. An example of a larger number is (10110111). = (183),. In general, using 
n bits allows representation of positive integers in the range 0 to 2” — 1. 

In a binary number the right-most bit is usually referred to as the least-significant bit 
(LSB). The left-most bit, which has the highest power of 2 associated with it, is called the 
most-significant bit (MSB). In digital systems it is often convenient to consider several bits 
together as a group. A group of four bits is called a nibble, and a group of eight bits is called 
a byte. 


1.5.2 CONVERSION BETWEEN DECIMAL AND BINARY SYSTEMS 


A binary number is converted into a decimal number simply by applying Equation 1.1 and 
evaluating it using decimal arithmetic. Converting a decimal number into a binary number 
is not quite as straightforward, because we need to construct the number by using powers of 
2. For example, the number (17) 9 is 2+ + 2° = (10001)>, and the number (50) 49 is 2° + 
2+ + 2! = (110010)>. In general, the conversion can be performed by successively dividing 
the decimal number by 2 as follows. Suppose that a decimal number D = dy_; ---dido, 
with a value V, is to be converted into a binary number B = b,_; - -- bab, bo. Then, we can 
write V in the form 


V = by KO fs hy ey ee 
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Convert (857) 19 
Remainder 

857+2 = 428 1 LSB 
428+2 = 214 0 
214+ 2 107 0 
107+ 2 53 1 

53+ 2 26 1 

26+ 2 13 0 

13+ 2 6 1 

6+ 2 3 0 

3+2 = 1 1 

1+2 = 0O 1 MSB 


Result is (1101011001)> 


Figure 1.6 = Conversion from decimal to binary. 


If we now divide V by 2, the result is 


Y = Dy-1 Oe Basak By O45, 4 

2 2 

The quotient of this integer division is b,_; x 2-2 4... 4 by x 2+ dj, and the remainder 
is bo. If the remainder is 0, then bo = 0; if it is 1, then bp = 1. Observe that the quotient 
is just another binary number, which comprises n — | bits, rather than n bits. Dividing this 
number by 2 yields the remainder b. The new quotient is 


Dy-1 x gra 4 -+++ bo 


Continuing the process of dividing the new quotient by 2, and determining one bit in each 
step, will produce all bits of the binary number. The process continues until the quotient 
becomes 0. Figure 1.6 illustrates the conversion process, using the example (857) 19 = 
(1101011001). Note that the least-significant bit (LSB) is generated first and the most- 
significant bit (MSB) is generated last. 

So far, we have considered only the representation of positive integers. In Chapter 3 
we will complete the discussion of number representation by explaining how negative 
numbers are handled and how fixed-point and floating-point numbers may be represented. 
We will also explain how arithmetic operations are performed in computers. 


1.5.3 ASCII CHARACTER CODE 


Alphanumeric information, such as letters and numbers typed on a computer keyboard, is 
represented as codes consisting of 0 and | digits. The most common code used for this type 
of information is known as the ASCII code, which stands for the American Standard Code 
for Information Interchange. The code specified by this standard is presented in Table 1.2. 
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Table 1.2 The seven-bit ASCII code. 

Bit Bit positions 654 

positions 

3210 000 001 010 O11 100 101 110 111 
0000 NUL DLE SPACE 0 @ 1p % ) 
0001 SOH DC1 ! 1 A Q a 

0010 STX DC2 “ 2 B R b r 
0011 ETX DC3 # 3 € S C s 
0100 EOT DC4 $ 4 D aly d t 
0101 ENQ NAK % 5) 15) U e u 
0110 ACK SYN & 6 F Vv f Vv 
O1l1 BEL ETB ; 7 G Ww g w 
1000 BS CAN ( 8 H x h x 
1001 HT EM ) g) I YG i y 
1010 LF SUB Hy J Zi j Zz 
1011 VT ESC + : K [ k { 
1100 FF FS ‘ < IL; \ 1 | 
1101 CR GS - = M ] m } 
1110 SO RS = N ‘i n a 
1111 SI US / v oO — ° DEL 
NUL Null/Idle SI Shift in 

SOH Start of header DLE Data link escape 

STX Start of text DC1-DC4 Device control 

ETX End of text NAK Negative acknowledgement 
EOT End of transmission SYN Synchronous idle 

ENQ Enquiry ETB End of transmitted block 
ACQ Acknowledgement CAN Cancel (error in data) 

BEL Audible signal EM End of medium 

BS Back space SUB Special sequence 

HT Horizontal tab ESC Escape 

IER Line feed FS File separator 

VT Vertical tab GS Group separator 

FEF Form feed RS Record separator 

CR Carriage return US Unit separator 

SO Shift out DEL Delete/Idle 


Bit positions of code format = 


6b beh fo] 


15 


16 
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The ASCII code uses seven-bit patterns to denote 128 different characters. Ten of 
the characters are decimal digits 0 to 9. As the table shows, the high-order bits have the 
same pattern, bebsb4 = 011, for all 10 digits. Each digit is identified by the low-order 
four bits, b3_9, using the binary patterns for these digits. Capital and lowercase letters are 
encoded in a way that makes sorting of textual information easy. The codes for A to Z are in 
ascending numerical sequence, which means that the task of sorting letters (or words) can 
be accomplished by a simple arithmetic comparison of the codes that represent the letters. 

In addition to codes that represent characters and letters, the ASCII code includes 
punctuation marks such as !_ and ?, commonly used symbols such as & and %, and a 
collection of control characters. The control characters are those needed in computer 
systems to handle and transfer data among various devices. For example, the carriage 
return character, which is abbreviated as CR in the table, indicates that the carriage, or 
cursor position, of an output device, such as a printer or display, should return to the left- 
most column. 

The ASCII code is used to encode information that is handled as text. Itis not convenient 
for representation of numbers that are used as operands in arithmetic operations. For this 
purpose, it is best to convert ASCH-encoded numbers into a binary representation that we 
discussed before. 

The ASCII standard uses seven bits to encode a character. In computer systems a more 
natural size is eight bits, or one byte. There are two common ways of fitting an ASCII- 
encoded character into a byte. One is to set the eighth bit, b7, to 0. Another is to use this 
bit to indicate the parity of the other seven bits, which means showing whether the number 
of 1s in the seven-bit code is even or odd. We discuss parity in Chapter 4. 


1.5.4 DIGITAL AND ANALOG INFORMATION 


Binary numbers can be used to represent many types of information. For example, they 
can represent music that is stored in a personal music player. Figure 1.7 illustrates a music 
player, which contains an electronic memory for storing music files. A music file comprises 
a sequence of binary numbers that represent tones. To convert these binary numbers into 
sound, a digital-to-analog (D/A) converter circuit is used. It converts digital values into 
corresponding voltage levels, which create an analog voltage signal that drives the speakers 
inside the headphones. The binary values stored in the music player are referred to as digital 
information, whereas the voltage signal that drives the speakers is analog information. 


1.6 THEORY AND PRACTICE 


Modern design of logic circuits depends heavily on CAD tools, but the discipline of logic 
design evolved long before CAD tools were invented. This chronology is quite obvious 
because the very first computers were built with logic circuits, and there certainly were no 
computers available on which to design them! 


1.6 THEORY AND PRACTICE 


Headphones 


11000100110 
10010001000 
11111000101 


00101001010 


Figure 1.7 —_ Using digital technology to represent music. 


Numerous manual design techniques have been developed to deal with logic circuits. 
Boolean algebra, which we will introduce in Chapter 2, was adopted as a mathematical 
means for representing such circuits. An enormous amount of “theory” was developed 
showing how certain design issues may be treated. To be successful, a designer had to 
apply this knowledge in practice. 

CAD tools not only made it possible to design incredibly complex circuits but also 
made the design work much simpler in general. They perform many tasks automatically, 
which may suggest that today’s designer need not understand the theoretical concepts used 
in the tasks performed by CAD tools. An obvious question would then be, Why should one 
study the theory that is no longer needed for manual design? Why not simply learn how to 
use the CAD tools? 

There are three big reasons for learning the relevant theory. First, although the CAD 
tools perform the automatic tasks of optimizing a logic circuit to meet particular design 
objectives, the designer has to give the original description of the logic circuit. If the 
designer specifies a circuit that has inherently bad properties, then the final circuit will also 
be of poor quality. Second, the algebraic rules and theorems for design and manipulation 
of logic circuits are directly implemented in today’s CAD tools. It is not possible for a user 
of the tools to understand what the tools do without grasping the underlying theory. Third, 
CAD tools offer many optional processing steps that a user can invoke when working on 
a design. The designer chooses which options to use by examining the resulting circuit 
produced by the CAD tools and deciding whether it meets the required objectives. The 
only way that the designer can know whether or not to apply a particular option in a given 
situation is to know what the CAD tools will do if that option is invoked—again, this 
implies that the designer must be familiar with the underlying theory. We discuss the logic 
circuit theory extensively in this book, because it is not possible to become an effective 
logic circuit designer without understanding the fundamental concepts. 
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There is another good reason to learn some logic circuit theory even if it were not 
required for CAD tools. Simply put, it is interesting and intellectually challenging. In the 
modern world filled with sophisticated automatic machinery, it is tempting to rely on tools as 
a substitute for thinking. However, in logic circuit design, as in any type of design process, 
computer-based tools are not a substitute for human intuition and innovation. Computer- 
based tools can produce good digital hardware designs only when employed by a designer 
who thoroughly understands the nature of logic circuits. 


*1.1 


1.2 


1.3 


*1.4 


PROBLEMS 


Answers to problems marked by an asterisk are given at the back of the book. 


Convert the following decimal numbers into binary, using the method shown in Figure 1.6. 
(a) (20) 10 

(b) (100) 10 

(c) (129)10 

(d) (260)10 

(e) (10240) 10 


Convert the following decimal numbers into binary, using the method shown in Figure 1.6. 
(a) (30) 10 

(b) (110) 10 

(c) (259)10 

(d) (500) 10 

(e) (20480) 10 


Convert the following decimal numbers into binary, using the method shown in Figure 1.6. 
(a) (1000) 10 

(b) (10000) io 

(c) (100000) 19 

(c) (1000000) 10 


In Figure 1.6 we show how to convert a decimal number into binary by successively dividing 
by 2. Another way to derive the answer is to constuct the number by using powers of 2. 
For example, if we wish to convert the number (23) ;9, then the largest power of 2 that is 
not larger than 23 is 24 = 16. Hence, the binary number will have five bits and the most- 
significant bit is by = 1. We then perform the subtraction 23 — 16 = 7. Now, the largest 
power of 2 that is not larger than 7 is 27 = 4. Hence, b3 = 0 (because 2° = 8 is larger than 
7) and by = 1. Continuing this process gives 


23 = 16+44+2+1 
= 944.97 4:9) 4.59 
= 10000 + 00100 + 00010 + 00001 
= 10111 
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Using this method, convert the following decimal numbers into binary. 
(a) (17)10 

(b) (33)10 

(c) (67)10 

(d) (130) 10 

(e) (2560) 10 

(£) (51200) jo 


1.5. Repeat Problem 3 using the method described in Problem 4. 


*1.6 Convert the following binary numbers into decimal. 
(a) (1001)2 
(b) (11100)> 
(c) (111111) 
(d) (101010101010). 


1.7 Convert the following binary numbers into decimal. 
(a) (110010) 
(b) (1100100). 
(c) (11001000)> 
(d) (110010000). 


*1.8 What is the minimum number of bits needed to represent the following decimal numbers 
in binary? 
(a) (270) 10 
(b) (520) 10 
(c) (780) 10 
(d) (1029) 10 


1.9 Repeat Problem 8 for the following decimal numbers: 
(a) (111)i0 
(b) (333)10 
(c) (555)10 
(d) (1111)i0 
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INTRODUCTION TO LOGIC CIRCUITS 


CHAPTER OBJECTIVES 


In this chapter you will be introduced to: 


e Logic functions and circuits 

e Boolean algebra for dealing with logic functions 

e Logic gates and synthesis of simple circuits 

e CAD tools and the Verilog hardware description language 


e Minimization of functions and Karnaugh maps 
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The study of logic circuits is motivated mostly by their use in digital computers. But such circuits also form 
the foundation of many other digital systems, such as those that perform control applications or are involved in 
digital communications. All such applications are based on some simple logical operations that are performed 
on input information. 

In Chapter | we showed that information in computers is represented as electronic signals that can have 
two discrete values. Although these values are implemented as voltage levels in a circuit, we refer to them 
simply as logic values, 0 and 1. Any circuit in which the signals are constrained to have only some number of 
discrete values is called a logic circuit. Logic circuits can be designed with different numbers of logic values, 
such as three, four, or even more, but in this book we deal only with the binary logic circuits that have two 
logic values. 

Binary logic circuits have the dominant role in digital technology. We hope to provide the reader with 
an understanding of how these circuits work, how are they represented in mathematical notation, and how 
are they designed using modern design automation techniques. We begin by introducing some basic concepts 
pertinent to the binary logic circuits. 


| 2.1 VARIABLES AND FUNCTIONS 


The dominance of binary circuits in digital systems is a consequence of their simplicity, 
which results from constraining the signals to assume only two possible values. The simplest 
binary element is a switch that has two states. If a given switch is controlled by an input 
variable x, then we will say that the switch is open if x = 0 and closed if x = 1, as illustrated 
in Figure 2.1a. We will use the graphical symbol in Figure 2.1b to represent such switches 
in the diagrams that follow. Note that the control input x is shown explicitly in the symbol. 
In Appendix B we explain how such switches are implemented with transistors. 

Consider a simple application of a switch, where the switch turns a small lightbulb on 
or off. This action is accomplished with the circuit in Figure 2.2a. A battery provides the 
power source. The lightbulb glows when a sufficient amount of current passes through it. 


x=0 x= 1 


(a) Two states of a switch 


x 


(b) Symbol for a switch 


Figure 2.1 A binary switch. 


2.1. VARIABLES AND FUNCTIONS 


Battery 


(b) Using a ground connection as the return path 


Figure 2.2 A light controlled by a switch. 


The current flows when the switch is closed, that is, when x = 1. In this example the input 
that causes changes in the behavior of the circuit is the switch control x. The output is 
defined as the state (or condition) of the light, which we will denote by the letter L. If the 
light is on, we will say that L = 1. If the light is off, we will say that L = 0. Using this 
convention, we can describe the state of the light as a function of the input variable x. Since 
L=1ifx=1andLl= Oifx =0, we can say that 


L(x) =x 


This simple logic expression describes the output as a function of the input. We say that 
L(x) = x is a logic function and that x is an input variable. 

The circuit in Figure 2.2a can be found in an ordinary flashlight, where the switch is a 
simple mechanical device. In an electronic circuit the switch is implemented as a transistor 
and the light may be a light-emitting diode (LED). An electronic circuit is powered by a 
power supply of a certain voltage, usually in the range of | to 5 volts. One side of the power 
supply provides the circuit ground, as illustrated in Figure 2.2b. The circuit ground is a 
common reference point for voltages in the circuit. Rather than drawing wires in a circuit 
diagram for all nodes that return to the circuit ground, the diagram can be simplified by 
showing a connection to a ground symbol, as we have done for the bottom terminal of the 
light Lin the figure. In the circuit diagrams that follow we will use this convention, because 
it makes the diagrams look simpler. 

Consider now the possibility of using two switches to control the state of the light. Let 
x, and x be the control inputs for these switches. The switches can be connected either 
in series or in parallel as shown in Figure 2.3. Using a series connection, the light will be 
turned on only if both switches are closed. If either switch is open, the light will be off. 
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(b) The logical OR function (parallel connection) 


Figure 2.3 Two basic functions. 


This behavior can be described by the expression 


L(x), X2) = X1 +X 
where L=1lifx; =landx,=1, 


L = 0 otherwise. 


The “-” symbol is called the AND operator, and the circuit in Figure 2.3a is said to implement 
a logical AND function. 

The parallel connection of two switches is given in Figure 2.3b. In this case the light 
will be on if either the x; or x2 switch is closed. The light will also be on if both switches 
are closed. The light will be off only if both switches are open. This behavior can be stated 
as 


L(x), X2) = X1 + x2 


where L=1lifx,; =lorxy =lorifx,) =x» =1, 
L=O0ifx; =x =0. 


The + symbol is called the OR operator, and the circuit in Figure 2.3b is said to implement 
a logical OR function. It is important not to confuse the use of the + symbol with its more 
common meaning, which is for arithmetic addition. In this chapter the + symbol represents 
the logical OR operation unless otherwise stated. 

In the above expressions for AND and OR, the output L(x, x2) is a logic function with 
input variables x; and x2. The AND and OR functions are two of the most important logic 
functions. Together with some other simple functions, they can be used as building blocks 
for the implementation of all logic circuits. Figure 2.4 illustrates how three switches can be 


2.2. INVERSION 


Figure 2.5 — An inverting circuit. 


used to control the light in a more complex way. This series-parallel connection of switches 
realizes the logic function 


L(x1, X2, X3) = (41 + x2) + x3 


The light is on if x3 = 1 and, at the same time, at least one of the x; or x2 inputs is equal 
to l. 
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2.2 INVERSION 


So far we have assumed that some positive action takes place when a switch is closed, such 
as turning the light on. It is equally interesting and useful to consider the possibility that a 
positive action takes place when a switch is opened. Suppose that we connect the light as 
shown in Figure 2.5. In this case the switch is connected in parallel with the light, rather 
than in series. Consequently, a closed switch will short-circuit the light and prevent the 
current from flowing through it. Note that we have included an extra resistor in this circuit 
to ensure that the closed switch does not short-circuit the power supply. The light will be 
turned on when the switch is opened. Formally, we express this functional behavior as 


L(x) =x 
where L=1lifx=0O, 
L=0ifx=1 
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The value of this function is the inverse of the value of the input variable. Instead of 
using the word inverse, it is more common to use the term complement. Thus we say that 
L(x) is a complement of x in this example. Another frequently used term for the same 
operation is the NOT operation. There are several commonly used notations for indicating 
the complementation. In the preceding expression we placed an overbar on top of x. This 
notation is probably the best from the visual point of view. However, when complements 
are needed in expressions that are typed using a computer keyboard, which is often done 
when using CAD tools, it is impractical to use overbars. Instead, either an apostrophe is 
placed after the variable, or an exclamation mark (!), the tilde character (~), or the word 
NOT is placed in front of the variable to denote the complementation. Thus the following 
are equivalent: 


x =x =x = ~x = NOTx 


The complement operation can be applied to a single variable or to more complex 
operations. For example, if 
f 1, %2) = x1 +22 
then the complement of f is 


f(%1,%2) = x1 +x 


This expression yields the logic value | only when neither x; nor x2 is equal to 1, that is, 
when x; = x2 = 0. Again, the following notations are equivalent: 


XX = (+%)! = '!@) +42) = ~01 +x) = NOT (+x) 


2.3 TRUTH TABLES 


We have introduced the three most basic logic operations—AND, OR, and complement—by 
relating them to simple circuits built with switches. This approach gives these operations a 
certain “physical meaning.” The same operations can also be defined in the form of a table, 
called a truth table, as shown in Figure 2.6. The first two columns (to the left of the double 
vertical line) give all four possible combinations of logic values that the variables x; and x2 


xy Xo xX, ° Xo xy + Xo 

0 O 0 0 

0 1 0 1 

1 0 0 1 

1 1 it 1 
AND OR 


Figure 2.6 — A truth table for the AND and OR operations. 


2.4 LoGic GATES AND NETWORKS 


X,  X_  Xq || Xp XQ + Xz | XY t+ XQ + X3 
0 oO O 0 0 
0 oO 1 0 1 
0 Jt 0 0 1 
0 1 1 0 1 
1 0 0O 0 1 
1 oO 1 0 1 
1 1 O 0 1 
1 1 1 1 1 


Figure 2.7. — Three-input AND and OR operations. 


can have. The next column defines the AND operation for each combination of values of x; 
and x2, and the last column defines the OR operation. Because we will frequently need to 
refer to “combinations of logic values” applied to some variables, we will adopt a shorter 
term, valuation, to denote such a combination of logic values. 

The truth table is a useful aid for depicting information involving logic functions. We 
will use it in this book to define specific functions and to show the validity of certain func- 
tional relations. Small truth tables are easy to deal with. However, they grow exponentially 
in size with the number of variables. A truth table for three input variables has eight rows 
because there are eight possible valuations of these variables. Such a table is given in 
Figure 2.7, which defines three-input AND and OR functions. For four input variables the 
truth table has 16 rows, and so on. In general, for n input variables the truth table has 2” 
rows. 

The AND and OR operations can be extended to n variables. An AND function of 
variables x;, x2, ..., X, has the value | only if all n variables are equal to 1. An OR function 
of variables x), x2, ...,X, has the value | if one or more of the variables is equal to 1. 
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The three basic logic operations introduced in the previous sections can be used to implement 
logic functions of any complexity. A complex function may require many of these basic 
operations for its implementation. Each logic operation can be implemented electronically 
with transistors, resulting in a circuit element called a logic gate. A logic gate has one or 
more inputs and one output that is a function of its inputs. It is often convenient to describe 
a logic circuit by drawing a circuit diagram, or schematic, consisting of graphical symbols 
representing the logic gates. The graphical symbols for the AND, OR, and NOT gates are 
shown in Figure 2.8. The figure indicates on the left side how the AND and OR gates are 
drawn when there are only a few inputs. On the right side it shows how the symbols are 
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xy 


X5 


Xx 
1 

=D Saeco. 
X4 e 


(c) NOT gate 


Figure 2.8 — The basic gates. 


f = (4, +X) + x3 


Figure 2.9 The function from Figure 2.4. 


augmented to accommodate a greater number of inputs. We show how logic gates are built 
using transistors in Appendix B. 

A larger circuit is implemented by a network of gates. For example, the logic function 
from Figure 2.4 can be implemented by the network in Figure 2.9. The complexity of a 
given network has a direct impact on its cost. Because it is always desirable to reduce 
the cost of any manufactured product, it is important to find ways for implementing logic 
circuits as inexpensively as possible. We will see shortly that a given logic function can 


2.4 LoGic GATES AND NETWORKS 


be implemented with a number of different networks. Some of these networks are simpler 
than others, hence searching for the solutions that entail minimum cost is prudent. 

In technical jargon a network of gates is often called a logic network or simply a logic 
circuit. We will use these terms interchangeably. 


2.4.1 ANALYSIS OF A LOGIC NETWORK 


A designer of digital systems is faced with two basic issues. For an existing logic network, it 
must be possible to determine the function performed by the network. This task is referred 
to as the analysis process. The reverse task of designing a new network that implements a 
desired functional behavior is referred to as the synthesis process. The analysis process is 
rather straightforward and much simpler than the synthesis process. 

Figure 2.10a shows a simple network consisting of three gates. To analyze its functional 
behavior, we can consider what happens if we apply all possible combinations of the input 
signals to it. Suppose that we start by making x; = x. = 0. This forces the output of the 
NOT gate to be equal to | and the output of the AND gate to be 0. Because one of the inputs 
to the OR gate is 1, the output of this gate will be 1. Therefore, f = 1 ifx; = x. = 0. If we 
then let x} = 0 and x2 = 1, no change in the value of f will take place, because the outputs 
of the NOT and AND gates will still be 1 and 0, respectively. Next, if we apply x, = 1 
and x7 = 0, then the output of the NOT gate changes to 0 while the output of the AND gate 
remains at 0. Both inputs to the OR gate are then equal to 0; hence the value of f will be 0. 
Finally, let x} = x2 = 1. Then the output of the AND gate goes to 1, which in turn causes 
f to be equal to 1. Our verbal explanation can be summarized in the form of the truth table 
shown in Figure 2.10b. 


Timing Diagram 

We have determined the behavior of the network in Figure 2.10a by considering the four 
possible valuations of the inputs x, and x2. Suppose that the signals that correspond to these 
valuations are applied to the network in the order of our discussion; that is, (x1, x2) = (0, 0) 
followed by (0, 1), (1, 0), and (1, 1). Then changes in the signals at various points in the 
network would be as indicated in blue in the figure. The same information can be presented 
in graphical form, known as a timing diagram, as shown in Figure 2.10c. The time runs 
from left to right, and each input valuation is held for some fixed duration. The figure shows 
the waveforms for the inputs and output of the network, as well as for the internal signals 
at the points labeled A and B. 

The timing diagram in Figure 2.10c shows that changes in the waveforms at points A 
and B and the output f take place instantaneously when the inputs x; and x2 change their 
values. These idealized waveforms are based on the assumption that logic gates respond 
to changes on their inputs in zero time. Such timing diagrams are useful for indicating 
the functional behavior of logic circuits. However, practical logic gates are implemented 
using electronic circuits which need some time to change their states. Thus, there is a delay 
between a change in input values and a corresponding change in the output value of a gate. 
In chapters that follow we will use timing diagrams that incorporate such delays. 
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(a) Network that implements f = xX, +x, «x5 
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(b) Truth table 
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(c) Timing diagram 
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(d) Network that implements g = x, +x, 


Figure 2.10 = An example of logic networks. 


2.4 LoGic GATES AND NETWORKS 


Timing diagrams are used for many purposes. They depict the behavior of a logic 
circuit in a form that can be observed when the circuit is tested using instruments such as 
logic analyzers and oscilloscopes. Also, they are often generated by CAD tools to show 
the designer how a given circuit is expected to behave before it is actually implemented 
electronically. We will introduce the CAD tools later in this chapter and will make use of 
them throughout the book. 


Functionally Equivalent Networks 

Now consider the network in Figure 2.10d. Going through the same analysis procedure, 
we find that the output g changes in exactly the same way asf does in part (a) of the figure. 
Therefore, g(x), x2) =f (41,2), which indicates that the two networks are functionally 
equivalent; the output behavior of both networks is represented by the truth table in Figure 
2.10b. Since both networks realize the same function, it makes sense to use the simpler 
one, which is less costly to implement. 

In general, a logic function can be implemented with a variety of different networks, 
probably having different costs. This raises an important question: How does one find the 
best implementation for a given function? We will discuss some of the main approaches 
for synthesizing logic functions later in this chapter. For now, we should note that some 
manipulation is needed to transform the more complex network in Figure 2.10a into the 
network in Figure 2.10d. Since f (x1, x2) = X1 + x1 - x2 and g(x, x2) = X; +X, there must 
exist some rules that can be used to show the equivalence 


xX) + X1°X2 =X] + X2 


We have already established this equivalence through detailed analysis of the two circuits 
and construction of the truth table. But the same outcome can be achieved through algebraic 
manipulation of logic expressions. In Section 2.5 we will introduce a mathematical approach 
for dealing with logic functions, which provides the basis for modern design techniques. 
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As an example of a logic function, consider the diagram in Figure 2.11a. It includes two 
toggle switches that control the values of signals x and y. Each toggle switch can be pushed 
down to the bottom position or up to the top position. When a toggle switch is in the bottom 
position it makes a connection to logic value 0 (ground), and when in the top position it 
connects to logic value | (power supply level). Thus, these toggle switches can be used to 
set x and y to either 0 or 1. 

The signals x and y are inputs to a logic circuit that controls a light L. The required 
behavior is that the light should be on only if one, but not both, of the toggle switches 
is in the top position. This specification leads to the truth table in part (b) of the figure. 
Since L = 1 when x = 0 and y = | or when x = | and y = 0, we can implement this logic 
function using the network in Figure 2.11c. 

The reader may recognize the behavior of our light as being similar to that over a set of 
stairs in a house, where the light is controlled by two switches: one at the top of the stairs, 
and the other at the bottom. The light can be turned on or off by either switch because 


Example 2.1 


32 


CHAPTER 2 ° INTRODUCTION TO LOGIC CIRCUITS 


i Logic 0 0 0 
rag Ls 
circuit 0 1 1 
1 
J 1 0O 1 
4 1 1 0 
0 
(a) Two switches that control a light (b) Truth table 


(c) Logic network (d) XOR gate symbol 


Figure 2.11 == An example of a logic circuit. 


it follows the truth table in Figure 2.11b. This logic function, which differs from the OR 
function only when both inputs are equal to 1, is useful for other applications as well. It is 
called the exclusive-OR (XOR) function and is indicated in logic expressions by the symbol 
®. Thus, rather than writing L = x-y+x-y, we can write L = x © y. The XOR function 
has the logic-gate symbol illustrated in Figure 2.11d. 


Example 2.2 


In Chapter 1 we showed how numbers are represented in computers by using binary digits. 
As another example of logic functions, consider the addition of two one-digit binary numbers 
aand b. The four possible valuations of a, b and the resulting sums are given in Figure 2.12a 
(in this figure the + operator signifies addition). The sum S = s,5o has to be a two-digit 
binary number, because when a = b = | then S = 10. 

Figure 2.12b gives a truth table for the logic functions s; and so. From this table we 
can see that s,; = a-b and s) =a @b. The corresponding logic network is given in part 
(c) of the figure. This type of logic circuit, which adds binary numbers, is referred to as an 
adder circuit. We discuss circuits of this type in Chapter 3. 
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(b) Truth table (c) Logic network 


Figure 2.12 — Addition of binary numbers. 
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In 1849 George Boole published a scheme for the algebraic description of processes involved 
in logical thought and reasoning [1]. Subsequently, this scheme and its further refinements 
became known as Boolean algebra. It was almost 100 years later that this algebra found 
application in the engineering sense. In the late 1930s Claude Shannon showed that Boolean 
algebra provides an effective means of describing circuits built with switches [2]. The 
algebra can, therefore, be used to describe logic circuits. We will show that this algebra 
is a powerful tool that can be used for designing and analyzing logic circuits. The reader 
will come to appreciate that it provides the foundation for much of our modern digital 
technology. 


Axioms of Boolean Algebra 

Like any algebra, Boolean algebra is based on a set of rules that are derived from a 
small number of basic assumptions. These assumptions are called axioms. Let us assume 
that Boolean algebra involves elements that take on one of two values, 0 and 1. Assume 
that the following axioms are true: 
la. 0-0=0 
Ib 1+1=1 
2a. 1-l=1 
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2b. 0+0=0 

3a. O-1=1-0=0 
3b. 14+0=04+1=1 
4a. Ifx=0,thenx = 1 
4b. Ifx=1,thenx=0 


Single- Variable Theorems 


From the axioms we can define some rules for dealing with single variables. These 
rules are often called theorems. If x is a Boolean variable, then the following theorems 
hold: 


Sa. x-0=0 
5b. x+t+1= 

6a. x-l=x 
6b. x+0=x 
Ta. X+*xX=xX 

Tb. x+x=x 
8a. x-x=0 
8b. x+x=1 
9. X¥=x 


It is easy to prove the validity of these theorems by perfect induction, that is, by substituting 
the values x = 0 and x = | into the expressions and using the axioms given above. For 
example, in theorem Sa, if x = 0, then the theorem states that 0-0 = 0, which is true 
according to axiom la. Similarly, if x = 1, then theorem Sa states that 1-0 = 0, which 
is also true according to axiom 3a. The reader should verify that theorems 5a to 9 can be 
proven in this way. 


Duality 

Notice that we have listed the axioms and the single-variable theorems in pairs. This 
is done to reflect the important principle of duality. Given a logic expression, its dual is 
obtained by replacing all + operators with - operators, and vice versa, and by replacing 
all Os with Is, and vice versa. The dual of any true statement (axiom or theorem) in 
Boolean algebra is also a true statement. At this point in the discussion, the reader might 
not appreciate why duality is a useful concept. However, this concept will become clear 
later in the chapter, when we will show that duality implies that at least two different ways 
exist to express every logic function with Boolean algebra. Often, one expression leads to 
a simpler physical implementation than the other and is thus preferable. 


Two- and Three-Variable Properties 

To enable us to deal with a number of variables, it is useful to define some two- and 
three-variable algebraic identities. For each identity, its dual version is also given. These 
identities are often referred to as properties. They are known by the names indicated below. 
If x, y, and z are Boolean variables, then the following properties hold: 
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10a. x-y=y-x Commutative 
106. x+y=yt+x 

lla. x-(y-2z=(Q-y)-z Associative 
lb xt+(yt+2=(@t+y)+z 

12a. x-(y+z)=x-y+x-z Distributive 
12b. x+y-z=(+y)-(*4+2) 

13a. x+x-y=x Absorption 
13b. x-Qt+y)=x 

l4qa. x-y+x-y=x Combining 
14d. (+y)-@+y)=x 

Sa. xX-y=x+y DeMorgan’s theorem 


15b. x+y=X-y 
l6a. x+xX-y=x+y 

16b. x-(®+y)=x-y 

7a. x-yty-Z+x-Z7Z=xX-y4+x-z Consensus 
17b. wt+y)-Ot+z)-@4+29=at+y)-@+2) 


Again, we can prove the validity of these properties either by perfect induction or by 
performing algebraic manipulation. Figure 2.13 illustrates how perfect induction can be 
used to prove DeMorgan’s theorem, using the format of a truth table. The evaluation of 
left-hand and right-hand sides of the identity in 15a gives the same result. 

We have listed a number of axioms, theorems, and properties. Not all of these are 
necessary to define Boolean algebra. For example, assuming that the + and - operations 
are defined, it is sufficient to include theorems 5 and 8 and properties 10 and 12. These 
are sometimes referred to as Huntington’s basic postulates [3]. The other identities can be 
derived from these postulates. 

The preceding axioms, theorems, and properties provide the information necessary for 
performing algebraic manipulation of more complex expressions. 


x y|lx-y|x-yllx | y|xty 
0 0 0 1 1/1 1 
0 1 0 1 1/0 1 
1 0 0 1 0); 1 1 
1. J 1 0 0 0 0 


e_——- >er——- 
LHS RHS 


Figure 2.13 — Proof of DeMorgan’s theorem in 15a. 
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Example 2.3 Let us prove the validity of the logic equation 
(xy +3) + +.%3) = 21 X35 +X] + x3 


The left-hand side can be manipulated as follows. Using the distributive property, 12a, 
gives 


LHS = (4; +3) 1 + (x +3) + x3 


Applying the distributive property again yields 


LHS = x, +X +43 -X) +x, -%3 +23 -%3 


Note that the distributive property allows ANDing the terms in parenthesis in a way analo- 
gous to multiplication in ordinary algebra. Next, according to theorem 8a, the terms x; - X; 
and x3 - x3 are both equal to 0. Therefore, 


LHS = 0+ 4x3 -x; +x; -%3+0 
From 6b it follows that 
LHS = x3 -X) +21 -X3 
Finally, using the commutative property, 10a and 10b, this becomes 
LHS = x, -%3 +X) - x3 


which is the same as the right-hand side of the initial equation. 


Example 2.4 Consider the logic equation 
Xy + X3 +X. +X3 +X -X3 +X + X3 =X XQ +X +X. +X] + XQ 
The left-hand side can be manipulated as follows 


LHS = x1 -%3 +x) +43 +%2-%3+%X2-x3 using 10D 


= xy + (%3 +3) + X2- 3 + x3) using 12a 
=x-1+%.-1 using 8b 
=X + XQ using 6a 


The right-hand side can be manipulated as 


RHS = x1 -X2. +x, -(2+%X2) using 12a 


=X):%.4+x,-1 using 8b 
=X) -X. +x using 6a 
=x, +X) +X using 10b 


=x, +X using 16a 
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Being able to manipulate both sides of the initial equation into identical expressions estab- 
lishes the validity of the equation. Note that the same logic function is represented by either 
the left- or the right-hand side of the above equation; namely 


FOr, X25 43) = xy Xz + XQ XZ + Hy XZ +X +3 
= X1 XQ +X, XQ +X + X2 
As a result of manipulation, we have found a much simpler expression 
f 1, %2, X3) = x1 + X2 


which also represents the same function. This simpler expression would result in a lower- 
cost logic circuit that could be used to implement the function. 


Examples 2.3 and 2.4 illustrate the purpose of the axioms, theorems, and properties 
as a mechanism for algebraic manipulation. Even these simple examples suggest that it is 
impractical to deal with highly complex expressions in this way. However, these theorems 
and properties provide the basis for automating the synthesis of logic functions in CAD 
tools. To understand what can be achieved using these tools, the designer needs to be aware 
of the fundamental concepts. 


2.5.1 THE VENN DIAGRAM 


We have suggested that perfect induction can be used to verify the theorems and properties. 
This procedure is quite tedious and not very informative from the conceptual point of view. 
A simple visual aid that can be used for this purpose also exists. It is called the Venn 
diagram, and the reader is likely to find that it provides for a more intuitive understanding 
of how two expressions may be equivalent. 

The Venn diagram has traditionally been used in mathematics to provide a graphical 
illustration of various operations and relations in the algebra of sets. A set s is a collection 
of elements that are said to be the members of s. In the Venn diagram the elements of 
a set are represented by the area enclosed by a contour such as a square, a circle, or an 
ellipse. For example, in a universe N of integers from | to 10, the set of even numbers is 
E = {2,4, 6, 8, 10}. Acontour representing EF encloses the even numbers. The odd numbers 
form the complement of E; hence the area outside the contour represents E= {1, 3, 5, 7, 9}. 

Since in Boolean algebra there are only two values (elements) in the universe, B = 
{0, 1}, we will say that the area within a contour corresponding to a set s denotes that s = 1, 
while the area outside the contour denotes s = 0. In the diagram we will shade the area 
where s = |. The concept of the Venn diagram is illustrated in Figure 2.14. The universe B 
is represented by a square. Then the constants | and 0 are represented as shown in parts (a) 
and (b) of the figure. A variable, say, x, is represented by a circle, such that the area inside 
the circle corresponds to x = 1, while the area outside the circle corresponds to x = 0. 
This is illustrated in part (c). An expression involving one or more variables is depicted by 
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(a) Constant 1 (b) Constant 0 
(c) Variable x (d) x 


Figure 2.14 — The Venn diagram representation. 


shading the area where the value of the expression is equal to 1. Part (d) indicates how the 
complement of x is represented. 

To represent two variables, x and y, we draw two overlapping circles. Then the area 
where the circles overlap represents the case where x = y = 1, namely, the AND of x and 
y, as shown in part (e). Since this common area consists of the intersecting portions of x 
and y, the AND operation is often referred to formally as the intersection of x and y. Part 
(f) illustrates the OR operation, where x + y represents the total area within both circles, 
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Za (SN 


(Cc) x-(y+z) (f) x-ytx-z 


Figure 2.15 Verification of the distributive property x -(y +z) =x-y+x-z. 


namely, where at least one of x or y is equal to 1. Since this combines the areas in the 
circles, the OR operation is formally often called the union of x and y. 

Part (g) depicts the term x - y, which is represented by the intersection of the area for x 
with that for y. Part () gives a three-variable example; the expression x - y + z is the union 
of the area for z with that of the intersection of x and y. 

To see how we can use Venn diagrams to verify the equivalence of two expressions, 
let us demonstrate the validity of the distributive property, 12a, in Section 2.5. Figure 2.15 
gives the construction of the left and right sides of the identity that defines the property 


Xs (y+z)H=xX-y+x-z 


Part (a) shows the area where x = 1. Part (b) indicates the area for y + z. Part (c) gives the 
diagram for x - (y + z), the intersection of shaded areas in parts (a) and (b). The right-hand 
side is constructed in parts (d), (e), and (f). Parts (d) and (e) describe the terms x - y and 
x +z, respectively. The union of the shaded areas in these two diagrams then corresponds 
to the expression x - y +x - z, as seen in part (f). Since the shaded areas in parts (c) and 
(f) are identical, it follows that the distributive property is valid. 
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As another example, consider the identity 
XeVtR Z+y ZSExX-y+R-zZ 


which is illustrated in Figure 2.16. Notice that this identity states that the term y - z is fully 
covered by the terms x - y and X - z; therefore, this term can be omitted. This identity, which 
we listed earlier as property 17a, is often referred to as consensus. 

The reader should use the Venn diagram to prove some other identities. The examples 
below prove the distributive property 12b, and DeMorgan’s theorem, 15a. 


AS AS 
oS CT 
xy xy 
a 
2 
& 
( 


7S 


X+y+xX-Z+y-Z 


Figure 2.16 Verification of x-y+X-c¢ty-g=x-y+X-z. 
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The distributive property 12a in Figure 2.15 will look familiar to the reader, because it is valid 
both for Boolean variables and for variables that are real numbers. In the case of real-number 
variables, the operations involved would be multiplication and addition, rather than logical 
AND and OR. However, the dual form 125 of this property, x + y-z= (xn+y)-(*+2), 
does not hold for real-number variables involving multiplication and addition operations. 
To prove that this identity is valid in Boolean algebra we can use the Venn diagrams in 
Figure 2.17. Parts (a) and (b) of the figure depict the terms x and y - z, respectively, and 
part (c) gives the union of parts (a) and (b). Parts (d) and (e) depict the sum terms (x + y) 
and (x + z), and part (f) shows the intersection of (d) and (e). Since the diagrams in (c) 
and (f) are the same, this proves the identity. 


(c) x+y-z (f) @+y): @t+2) 


Figure 2.17 Proof of the distributive property 12b. 


Example 2.5 


42 


CHAPTER 2 . INTRODUCTION TO LOGIC CIRCUITS 


(a) x+y (c) x 
(b) x-y (J) ¥ 
(e) x+y 


Figure 2.18 — Proof of DeMorgan’s theorem 15a. 


Example 2.6 


A proof of DeMorgan’s theorem 15a by using Venn diagrams is illustrated in Figure 2.18. 
The diagram in part (b) of the figure, which is the complement of x - y, is the same as the 
diagram in part (e), which is the union of part (c) with part (d), thus proving the theorem. 
We leave it as an exercise for the reader to prove the dual form of DeMorgan’s theorem, 
15D. 


2.5.2 NOTATION AND TERMINOLOGY 


Boolean algebra is based on the AND and OR operations, for which we have adopted 
the symbols - and +, respectively. These are also the standard symbols for the familiar 
arithmetic multiplication and addition operations. Considerable similarity exists between 
the Boolean operations and the arithmetic operations, which is the main reason why the 
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same symbols are used. In fact, when single digits are involved there is only one significant 
difference; the result of 1 + 1 is equal to 2 in ordinary arithmetic, whereas it is equal to | 
in Boolean algebra as defined by theorem 7b in Section 2.5. 

Because of the similarity with the arithmetic addition and multiplication operations, 
the OR and AND operations are often called the logical sum and product operations. Thus 
x1 +X is the logical sum of x; and x2, and x, - x2 is the logical product of x; and x2. Instead 
of saying “logical product” and “logical sum,” it is customary to say simply “product” and 
“sum.” Thus we say that the expression 


Xy +X + x3 +X, + X4 + XQ +xZ + X4 
is a sum of three product terms, whereas the expression 
(% +3) + (1 + 3) - 2 +3 + x4) 


is a product of three sum terms. 


2.5.3 PRECEDENCE OF OPERATIONS 


Using the three basic operations—AND, OR, and NOT—1t is possible to construct an infinite 
number of logic expressions. Parentheses can be used to indicate the order in which the 
operations should be performed. However, to avoid an excessive use of parentheses, another 
convention defines the precedence of the basic operations. It states that in the absence of 
parentheses, operations in a logic expression must be performed in the order: NOT, AND, 
and then OR. Thus in the expression 


Xp +X. +X, - Xo 


it is first necessary to generate the complements of x; and x7. Then the product terms x; - x2 
and x1 - X2 are formed, followed by the sum of the two product terms. Observe that in the 
absence of this convention, we would have to use parentheses to achieve the same effect as 
follows: 


(x1 + X2) + (1) + %2)) 


Finally, to simplify the appearance of logic expressions, it is customary to omit the - 
operator when there is no ambiguity. Therefore, the preceding expression can be written as 


1X2 + X1X2 


We will use this style throughout the book. 
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Armed with some basic ideas, we can now try to implement arbitrary functions using the 
AND, OR, and NOT gates. Suppose that we wish to design a logic circuit with two inputs, 
x, and x2. Assume that x; and x2 represent the states of two switches, either of which may 
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Xp Xq |] f(xy, xX) 


- ee OO 
- Or Oo 
Fe Ore 


Figure 2.19 —_ A function to be synthesized. 


produce a 0 or 1. The function of the circuit is to continuously monitor the state of the 
switches and to produce an output logic value 1 whenever the switches (x1, x2) are in states 
(0, 0), (0, 1), or C1, 1). If the state of the switches is (1, 0), the output should be 0. We can 
express the required behavior using a truth table, as shown in Figure 2.19. 

A possible procedure for designing a logic circuit that implements this truth table is to 
create a product term that has a value of 1 for each valuation for which the output function 
f has to be 1. Then we can take a logical sum of these product terms to realize f. Let us 
begin with the fourth row of the truth table, which corresponds to x; = x2 = 1. The product 
term that is equal to | for this valuation is x; - x2, which is just the AND of x; and x2. Next 
consider the first row of the table, for which x; = x2 = 0. For this valuation the value 1 is 
produced by the product term x; - x2. Similarly, the second row leads to the term x, - x2. 
Thus f may be realized as 


f 1, X2) = x1%2 + X1X2 + XX 
The logic network that corresponds to this expression is shown in Figure 2.20a. 
Although this network implements f correctly, it is not the simplest such network. To 
find a simpler network, we can manipulate the obtained expression using the theorems and 


properties from Section 2.5. According to theorem 7b, we can replicate any term in a logical 
sum expression. Replicating the third product term, the above expression becomes 


SOL, X2) = XyX2 +X] X2 + XX + Xp x2 


Using the commutative property 10b to interchange the second and third product terms 
gives 


F G1, X2) = x1%2 + X1xQ + X1XQ + X 1x2 
Now the distributive property 12a allows us to write 


Sy, Xz) = (Hy +X )x2 +X] (X2 + X2) 
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xy 


x9 


(a) Canonical sum-of-products 


(b) Minimal-cost realization 


Figure 2.20 —Two implementations of the function in Figure 2.19. 


Applying theorem 8b we get 
f 1, %2) =1-x2+%1-1 
Finally, theorem 6a leads to 
f M1, %2) = 4X2 +X 


The network described by this expression is given in Figure 2.20b. Obviously, the cost of 
this network is much less than the cost of the network in part (a) of the figure. 

This simple example illustrates two things. First, a straightforward implementation of 
a function can be obtained by using a product term (AND gate) for each row of the truth 
table for which the function is equal to 1. Each product term contains all input variables, and 
it is formed such that if the input variable x; is equal to | in the given row, then x; is entered 
in the term; if x; = O in that row, then x; is entered. The sum of these product terms realizes 
the desired function. Second, there are many different networks that can realize a given 
function. Some of these networks may be simpler than others. Algebraic manipulation can 
be used to derive simplified logic expressions and thus lower-cost networks. 

The process whereby we begin with a description of the desired functional behavior 
and then generate a circuit that realizes this behavior is called synthesis. Thus we can 
say that we “synthesized” the networks in Figure 2.20 from the truth table in Figure 2.19. 
Generation of AND-OR expressions from a truth table is just one of many types of synthesis 
techniques that we will encounter in this book. 
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Example 2.7 


Figure 2.21a depicts a part of a factory that makes bubble gumballs. The gumballs travel 
on a conveyor that has three associated sensors 51, 52, and s3. The sensor s; is connected 
to a scale that weighs each gumball, and if a gumball is not heavy enough to be acceptable 
then the sensor sets s; = 1. Sensors sz and s3 examine the diameter of each gumball. If 
a gumball is too small to be acceptable, then s2 = 1, and if it is too large, then s3 = 1. 
If a gumball is of an acceptable weight and size, then the sensors give s; = sz = s3 = 0. 
The conveyor pushes the gumballs over a “trap door” that it used to reject the ones that 
are not properly formed. A gumball should be rejected if it is too large, or both too small 
and too light. The trap door is opened by setting the logic function f to the value 1. By 
inspection, we can see that an appropriate logic expression is f = 5152 + 53. We will use 
Boolean algebra to derive this logic expression from the truth table. 

The truth table for f is given in Figure 2.21b. It sets f to | for each row in the table 
where s3 has the value | (too large), as well as for each row where s; = 52 = | (too light and 
too small). As described previously, a logic expression for f can be formed by including a 
product term for each row where f = 1. Thus, we can write 


f = 515253 + 515283 + 815253 + 518753 + 518253 


a gumball L/ 


(a) Conveyor and sensors 


Sy Sq $3 f 
0 0 0 0 
0 0 1 1 
0 1 0 0 
01 1 1 
1 0 0 0 
1 0 1 1 
1 1 0 1 
1 1 1 1 


(b) Truth table 


Figure 2.21 = A bubble gumball factory. 
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We can use algebraic manipulation to simplify this expression in a number of ways. For 
example, as shown below, we can first use rule 7b to repeat the term s5;5253, and then use 
the distributive property 12a and rule 8b to simplify the expression 
f = 515253 + 515253 + 515253 + 518283 + 815253 + 515253 

= 51 83(S2 + $2) + 5153(S2 + $2) + 8152(53 +53) 

= $153 + 5153 + S152 
Now, using the combining property 14a on the first two product terms gives 

f = 53 +8152 


The observant reader will notice that using the combining property 14a is really just a short 
form of first using the distributive property 12a and then applying rule 8b, as we did in the 
previous step. Our simplified expression for f is the same as the one that we determined 
earlier, by inspection. 
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There are different ways in which we can simplify the logic expression produced from the 
truth table in Figure 2.21b. Another approach is to first repeat the term 5; 5253, as we did in 
Example 2.7, and then proceed as follows 
Ff = 518283 + 518253 + 815253 + 515283 + 515283 + 518283 

= 53(S152 + S152 + 5182 + 5182) + 5152(53 + 53) 

= §3-1+ s,s. 

= $3 + S182 
Here, we used the distributive property 12a to produce the expression (5152 + $152 + 5152 + 


5,52). Since this expression includes all possible valuations of 51, 52, itis equal to 1, leading 
to the same expression for f that we derived before. 


Example 2.8 


Yet another way of producing the symplified logic expression is shown below. 


f = 515253 + 515283 + 515283 + 515253 + 515253 
= 515253 + 818283 + 515253 + 815253 + 915283 + 515253 
= 5153 (52 + 52) + 5253(51 + 51) + 8152(53 + 53) 
= $183 + 5253 + $182 
= 53(51 + 82) + S152 
= 53(5152) + 5152 
= $3 + $152 
In this solution, we first repeat the term 515253, and then symplify to generate the expression 


53(S1 + 52) +5152. Using DeMorgan’s theorem 15a we can replace (s; + 52) with (5752), 
which can then be deleted by applying property 16a. 


Example 2.9 
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As illustrated by Examples 2.7 to 2.9, there are multiple ways in which a logic expres- 
sion can be minimized by using Boolean algebra. This process can be daunting, because it 
is not obvious which rules, identities, and properties should be applied, and in what order. 
Later in this chapter, in Section 2.11, we will introduce a graphical technique, called the 
Karnaugh map, that clarifies this process by providing a systematic way of generating a 
minimal-cost logic expression for a function. 


2.6.1 SUM-OF-PRODUCTS AND PRODUCT-OF-SUMS FORMS 


Having introduced the synthesis process by means of simple examples, we will now present 
it in more formal terms using the terminology that is encountered in the technical literature. 
We will also show how the principle of duality, which was introduced in Section 2.5, applies 
broadly in the synthesis process. 

If a function f is specified in the form of a truth table, then an expression that realizes 
f can be obtained by considering either the rows in the table for which f = 1, as we have 
already done, or by considering the rows for which f = 0, as we will explain shortly. 


Minterms 

For a function of n variables, a product term in which each of the n variables appears 
once is called a minterm. The variables may appear in a minterm either in uncomplemented 
or complemented form. For a given row of the truth table, the minterm is formed by 
including x; if x; = 1 and by including x; if x; = 0. 

To illustrate this concept, consider the truth table in Figure 2.22. We have numbered the 
rows of the table from 0 to 7, so that we can refer to them easily. From the discussion of the 
binary number representation in Section 1.5, we can observe that the row numbers chosen 
are just the numbers represented by the bit patterns of variables x, x2, and x3. The figure 
shows all minterms for the three-variable table. For example, in the first row the variables 


Row 
number | x; XX, 3 Minterm Maxterm 
0 0 O O |) mo = X1X2xX3 | My = xX, + Xo + X35 
1 O O 1 |i my = XyxXox3 | My, = x, + xX) + X3 
2 O LO | my = X4xX_X3 | My = x, + X20 + X35 
3 O Ll 1 |) mg = xyxx3, | My = x, + X24 x, 
4 1 O O |) mg = X1X2xX3 | My = x1) + X02 + 3 
5 1 O 1 |) ms = xXx; | M5 = x) + X) + X3 
6 LoL OO |) mg = x4%x3 | Me = X) + X0 + 3 
7 1 1 1 Mz = XXX, | MZ = Xx, + X.+ X3 


Figure 2.22 — Three-variable minterms and maxterms. 
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have the values x; = x2 = x3 = 0, which leads to the minterm X|X.x3. In the second row 
xX, = X2 = 0 and x3 = 1, which gives the minterm X,X2x3, and so on. To be able to refer to 
the individual minterms easily, it is convenient to identify each minterm by an index that 
corresponds to the row numbers shown in the figure. We will use the notation m; to denote 
the minterm for row number i. Thus mo = X1X2X3, my = X1X2x3, and so on. 


Sum-of-Products Form 


A function f can be represented by an expression that is a sum of minterms, where each 
minterm is ANDed with the value of f for the corresponding valuation of input variables. For 
example, the two-variable minterms are mp = X1X2, m1 = X 1X2, M2 = X1X2, and m3 = X 1X0. 
The function in Figure 2.19 can be represented as 


f=m-l+m-1l+m-0+m-1 
=m +m, +m3 
= XX. + X1X2 + X1xX2 


which is the form that we derived in the previous section using an intuitive approach. Only 
the minterms that correspond to the rows for which f = | appear in the resulting expression. 

Any function f can be represented by a sum of minterms that correspond to the rows 
in the truth table for which f = 1. The resulting implementation is functionally correct and 
unique, but it is not necessarily the lowest-cost implementation of f. A logic expression 
consisting of product (AND) terms that are summed (ORed) is said to be in the sum-of- 
products (SOP) form. If each product term is a minterm, then the expression is called a 
canonical sum-of-products for the functionf . As we have seen in the example of Figure 2.20, 
the first step in the synthesis process is to derive a canonical sum-of-products expression 
for the given function. Then we can manipulate this expression, using the theorems and 
properties of Section 2.5, with the goal of finding a functionally equivalent sum-of-products 
expression that has a lower cost. 

As another example, consider the three-variable function f (x1, x2, x3), specified by the 
truth table in Figure 2.23. To synthesize this function, we have to include the minterms 7, 


Row 
number | x, XxX, x3 |] f(%,. xX, x3) 
0 0 O O 0 
1 0 oO 1 1 
2 0 1 O 0 
3 Oo 1 1 0 
4 1 0 O 1 
5 1 oO 1 1 
6 1 1 0 1 
7 1 1 1 0 


Figure 2.23 A three-variable function. 
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m4, Ms, and mg. Copying these minterms from Figure 2.22 leads to the following canonical 
sum-of-products expression for f 


FOL, X2, X3) = Xp X23 + AyA2X3 + XX2X3 + X1X2X3 
This expression can be manipulated as follows 


SF = (&1 + x1) X2x3 + x1 (X2 + X2)X3 
= 1-X5x3 +X, -1-% 


= X2X3 + X1X3 


This is the minimum-cost sum-of-products expression for f. It describes the circuit shown 
in Figure 2.24a. A good indication of the cost of a logic circuit is the total number of gates 
plus the total number of inputs to all gates in the circuit. Using this measure, the cost of 
the network in Figure 2.24a is 13, because there are five gates and eight inputs to the gates. 
By comparison, the network implemented on the basis of the canonical sum-of-products 
would have a cost of 27; from the preceding expression, the OR gate has four inputs, each 
of the four AND gates has three inputs, and each of the three NOT gates has one input. 
Minterms, with their row-number subscripts, can also be used to specify a given func- 
tion in a more concise form. For example, the function in Figure 2.23 can be specified 


f 
+3 
x 
(a) A minimal sum-of-products realization 
xy 
+3 
f 


» ——_o 


(b) A minimal product-of-sums realization 


Figure 2.24 — Two realizations of the function in Figure 2.23. 
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as 
f 1, 2,3) = Sm, ms, ms, M6) 
or even more simply as 
f (1.32.53) = )) (1, 4, 5, 6) 


The }~ sign denotes the logical sum operation. This shorthand notation is often used in 
practice. 


Consider the function Example 2.10 
f (1, x2,%3) = D> m(2, 3, 4, 6, 7) 
The canonical SOP expression for the function is derived using minterms 


f =m+m3 + m4 + m6 + m7 


= X1XoX3 + Xp x23 + XyX2K3 + Xp X2K3 + X1X2%3 
This expression can be simplified using the identities in Section 2.5 as follows 
f = X12 (%3 + x3) + x1 Ko + x2)X3 + x1x2(%3 + x3) 
= XjXQ + .X1X3 + Xj X2 
= (X1 + X4)x2 + x13 


=X. + X1X3 


Suppose that a four-variable function is defined by Example 2.11 
SF (1, X2,X3,.X%4) = S> (3, 7,9, 12, 13, 14, 15) 
The canonical SOP expression for this function is 
f= XyX2X3Xq + Hyg + Ay AAZX4 A Hy AYAZX4 A Xp + A AQAZA4 + Ay AA3XY 
A simpler SOP expression can be obtained as follows 


f =X, 2 + X2)x3%4 +X] (Xo + 2) 3x4 + Hy X2NZ3 (Hq + X4) + Hy XDXz(X4 + x4) 
= X1X3Xq + XY XZXq +X X2HZ + XYX2X3 
= X1X3X4 + XyX3Xq + XyX2(X3 + x3) 


= X1xX3X4 +X X3X4 + XX 


52 


CHAPTER 2 . INTRODUCTION TO LOGIC CIRCUITS 


Maxterms 

The principle of duality suggests that if it is possible to synthesize a function f by 
considering the rows in the truth table for which f = 1, then it should also be possible to 
synthesize f by considering the rows for which f = 0. This alternative approach uses the 
complements of minterms, which are called maxterms. All possible maxterms for three- 
variable functions are listed in Figure 2.22. We will refer to a maxterm Mj by the same row 
number as its corresponding minterm m; as shown in the figure. 

Product-of-Sums Form 


If a given function f is specified by a truth table, then its complement f can be rep- 
resented by a sum of minterms for which f = 1, which are the rows where f = 0. For 
example, for the function in Figure 2.19 


f (x1, x2) = m 
= X1X2 
If we complement this expression using DeMorgan’s theorem, the result is 
f =f = xx. 
=X, +x 


Note that we obtained this expression previously by algebraic manipulation of the canonical 
sum-of-products form for the function f. The key point here is that 


f =m = M) 


where M, is the maxterm for row 2 in the truth table. 
As another example, consider again the function in Figure 2.23. The complement of 
this function can be represented as 


Ff (x1, x2, x3) = mo + m2 + m3 + my 


= X1X2X3 +X XQH3 +X XQH3 + X1X2X3 


Then f can be expressed as 


f =m +m,+m3 + m7 
= Mo + M2 -M3 + M7 
= Mo - M2 -M3-M, 
= (X] + X2 +.X3)(%1 + Xz + 23) (1 + X2 + 3) (K+ X2 + X3) 


This expression represents f as a product of maxterms. 

A logic expression consisting of sum (OR) terms that are the factors of a logical product 
(AND) is said to be of the product-of-sums (POS) form. If each sum term is a maxterm, then 
the expression is called a canonical product-of-sums for the given function. Any function 
f can be synthesized by finding its canonical product-of-sums. This involves taking the 
maxterm for each row in the truth table for which f = 0 and forming a product of these 
maxterms. 
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Returning to the preceding example, we can attempt to reduce the complexity of the 
derived expression that comprises a product of maxterms. Using the commutative property 
10b and the associative property 11b from Section 2.5, this expression can be written as 


fF = (G1 + 3) + x2) (1 + 3) + 2) (01 + (2 + X3)) 1 + 2 + %3)) 
Then, using the combining property 14, the expression reduces to 
f = (1 +.%3) G2 + X3) 


The corresponding network is given in Figure 2.24b. The cost of this network is 13. While 
this cost happens to be the same as the cost of the sum-of-products version in Figure 2.24a, 
the reader should not assume that the cost of a network derived in the sum-of-products form 
will in general be equal to the cost of a corresponding circuit derived in the product-of-sums 
form. 

Using the shorthand notation, an alternative way of specifying our sample function is 


f 1, X2,.%3) = T1(Mo, M2, M3, M7) 
or more simply 
f (1, X2,x3) = TIM (0, 2, 3, 7) 


The IT sign denotes the logical product operation. 

The preceding discussion has shown how logic functions can be realized in the form 
of logic circuits, consisting of networks of gates that implement basic functions. A given 
function may be realized with various different circuit structures, which usually implies 
a difference in cost. An important objective for a designer is to minimize the cost of the 
designed circuit. We will discuss strategies for finding minimum-cost implementations in 
Section 2.11. 
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Consider again the function in Example 2.10. Instead of using the minterms, we can specify Example 2.12 


this function as a product of maxterms for which f = 0, namely 
f (x1, x2, x3) = TIM (QO, 1, 5) 
Then, the canonical POS expression is derived as 
f =Mo-M, - Ms 
= (X1 + X2 +3) (%1 + 2X2 + X3)(%1 + X2 + %3) 
A simplified POS expression can be derived as 
f= 1 +2 + X3) 1 + Xo +3) 1 +X + 2X3) + x + Xs) 

= (+ X2) +03) (CX + X2) +3) 01 + 2 + %3)) 1 + 2 + %3)) 

= (1 + x2) + 3X3) (x41 + (X2 + X3)) 

= (1 + X2)(X2 + 3) 
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Another way of deriving this product-of-sums expression is to use the sum-of-products form 
of f. Thus, 


F (x1, x2, x3) = > m@, 1, 5) 


= Xj XQX3 + X1X2X3 + X1X2X3 


= X1X2X3 +X XQX3 +X X2XZ +X XX3 
= X1X2(%3 + x3) + X2x3(%1 + x1) 


= X1X2 + X2X3 


Now, first applying DeMorgan’s theorem 15), and then applying 15a (twice) gives 


f=f 
= (X1X2 + X2x3) 
= (¥1X2) 2x3) 
= (x1 + X2) (2 + X3) 
To see that this product-of-sums expression for f is equivalent to the sum-of-products 
expression that we derived in Example 2.10, we can slightly rearrange our expression as 


Ff = 2 + x1) G2 + X3). Now, recognizing that this expression has the form of the righthand 
side of the distributive property 12b, we have the sum-of-products expressionf = x2 + x1X3. 


2.7 NAND ann NOR Locic NETWORKS 


We have discussed the use of AND, OR, and NOT gates in the synthesis of logic circuits. 
There are other basic logic functions that are also used for this purpose. Particularly useful 
are the NAND and NOR functions which are obtained by complementing the output gener- 
ated by AND and OR operations, respectively. These functions are attractive because they 
are implemented with simpler electronic circuits than the AND and OR functions, as we 
discuss in Appendix B. Figure 2.25 gives the graphical symbols for the NAND and NOR 
gates. A bubble is placed on the output side of the AND and OR gate symbols to represent 
the complemented output signal. 

If NAND and NOR gates are realized with simpler circuits than AND and OR gates, 
then we should ask whether these gates can be used directly in the synthesis of logic circuits. 
In Section 2.5 we introduced DeMorgan’s theorem. Its logic gate interpretation is shown in 
Figure 2.26. Identity 15a is interpreted in part (a) of the figure. It specifies that a NAND of 
variables x; and x2 is equivalent to first complementing each of the variables and then ORing 
them. Notice on the far-right side that we have indicated the NOT gates simply as bubbles, 
which denote inversion of the logic value at that point. The other half of DeMorgan’s 
theorem, identity 15), appears in part (b) of the figure. It states that the NOR function is 
equivalent to first inverting the input variables and then ANDing them. 


2.7 


(b) NOR gates 


Figure 2.25 = NAND and NOR gates. 


Do TEED © =D 
“ye = D> = -f>- 


(b) x) +X) = XX) 


NAND anp NOR Locic NETWORKS 


Xy HX Fe +X, 


Figure 2.26 = DeMorgan’s theorem in terms of logic gates. 
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In Section 2.6 we explained how any logic function can be implemented either in sum- 
of-products or product-of-sums form, which leads to logic networks that have either an 
AND-OR or an OR-AND structure, respectively. We will now show that such networks 
can be implemented using only NAND gates or only NOR gates. 

Consider the network in Figure 2.27 as a representative of general AND-OR networks. 
This network can be transformed into a network of NAND gates as shown in the figure. 
First, each connection between the AND gate and an OR gate is replaced by a connection 
that includes two inversions of the signal: one inversion at the output of the AND gate and 
the other at the input of the OR gate. Such double inversion has no effect on the behavior 
of the network, as stated formally in theorem 9 in Section 2.5. According to Figure 2.26a, 
the OR gate with inversions at its inputs is equivalent toa NAND gate. Thus we can redraw 
the network using only NAND gates, as shown in Figure 2.27. This example shows that 
any AND-OR network can be implemented as a NAND-NAND network having the same 
topology. 

Figure 2.28 gives a similar construction for a product-of-sums network, which can be 
transformed into a circuit with only NOR gates. The procedure is exactly the same as the 
one described for Figure 2.27 except that now the identity in Figure 2.26b is applied. The 
conclusion is that any OR-AND network can be implemented as a NOR-NOR network 
having the same topology. 


Example 2.13 


Let us implement the function 


f (x1, ¥2,23) = D> m2, 3, 4, 6,7) 


using NOR gates only. In Example 2.12 we showed that the function can be represented 
by the POS expression 


f = (1 + X2) (x2 +3) 


An OR-AND circuit that corresponds to this expression is shown in Figure 2.29a. Using 
the same structure of the circuit, a NOR-gate version is given in Figure 2.29b. Note that x3 
is inverted by a NOR gate that has its inputs tied together. 


Example 2.14 


Let us now implement the function 
f (x1, 22,23) = D> m(2, 3, 4, 6, 7) 
using NAND gates only. In Example 2.10 we derived the SOP expression 
f =x. +4x1%3 


which is realized using the circuit in Figure 2.30a. We can again use the same structure 
to obtain a circuit with NAND gates, but with one difference. The signal x2 passes only 
through an OR gate, instead of passing through an AND gate and an OR gate. If we simply 


2.7. NAND anpb NOR Logic NETWORKS 
xy x 
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X4 X4 
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Figure 2.27 — Using NAND gates to implement a sum-of-products. 
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Figure 2.28 — Using NOR gates to implement a product-of-sums. 
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(a) POS implementation 


(b) NOR implementation 


Figure 2.29 = NOR-gate realization of the function in Example 2.13. 


(a) SOP implementation 


x) 
X3 


(b) NAND implementation 


Figure 2.30 = NAND-gate realization of the function in Example 2.10. 


2.8 DESIGN EXAMPLES 


replace the OR gate with a NAND gate, this signal would be inverted which would result 
in a wrong output value. Since x2 must either not be inverted, or it can be inverted twice, 
we can pass it through two NAND gates as depicted in Figure 2.30b. Observe that for this 
circuit the output fis 


ff = X2 + x1%3 
Applying DeMorgan’s theorem, this expression becomes 


f = x2 + «1X3 


2.8 DESIGN EXAMPLES 


Logic circuits provide a solution to a problem. They implement functions that are needed to 
carry out specific tasks. Within the framework of a computer, logic circuits provide complete 
capability for execution of programs and processing of data. Such circuits are complex and 
difficult to design. But regardless of the complexity of a given circuit, a designer of logic 
circuits is always confronted with the same basic issues. First, it is necessary to specify the 
desired behavior of the circuit. Second, the circuit has to be synthesized and implemented. 
Finally, the implemented circuit has to be tested to verify that it meets the specifications. 
The desired behavior is often initially described in words, which then must be turned into 
a formal specification. In this section we give three simple examples of design. 


2.8.1 THREE-Way LIGHT CONTROL 


Assume that a large room has three doors and that a switch near each door controls a light 
in the room. It has to be possible to turn the light on or off by changing the state of any one 
of the switches. 

As a first step, let us turn this word statement into a formal specification using a truth 
table. Let x;, x2, and x3 be the input variables that denote the state of each switch. Assume 
that the light is off if all switches are open. Closing any one of the switches will turn the 
light on. Then turning on a second switch will have to turn off the light. Thus the light 
will be on if exactly one switch is closed, and it will be off if two (or no) switches are 
closed. If the light is off when two switches are closed, then it must be possible to turn 
it on by closing the third switch. If f (x1, x2, x3) represents the state of the light, then the 
required functional behavior can be specified as shown in the truth table in Figure 2.31. 
The canonical sum-of-products expression for the specified function is 


f =m +m +m4+ m7 


= X1XQN3 + X1X2XZ + AXjX2X3 + X1X2X3 


This expression cannot be simplified into a lower-cost sum-of-products expression. The 
resulting circuit is shown in Figure 2.32a. 
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Figure 2.31 Truth table for the three-way light 
control. 


An alternative realization for this function is in the product-of-sums form. The canon- 
ical expression of this type is 


f =Mo- M3: Ms - Mo 
= (x1 + 2X2 +.3) (1 +X. + 2X3) + x2 +3) 1 + X2 + 23) 


The resulting circuit is depicted in Figure 2.32b. It has the same cost as the circuit in part 
(a) of the figure. 

When the designed circuit is implemented, it can be tested by applying the various 
input valuations to the circuit and checking whether the output corresponds to the values 
specified in the truth table. A straightforward approach is to check that the correct output 
is produced for all eight possible input valuations. 


2.8.2 MULTIPLEXER CIRCUIT 


In computer systems it is often necessary to choose data from exactly one of a number 
of possible sources. Suppose that there are two sources of data, provided as input signals 
x; and x. The values of these signals change in time, perhaps at regular intervals. Thus 
sequences of Os and Is are applied on each of the inputs x; and x2. We want to design a 
circuit that produces an output that has the same value as either x; or x2, dependent on the 
value of a selection control signal s. Therefore, the circuit should have three inputs: x,, 
x2, and s. Assume that the output of the circuit will be the same as the value of input x, if 
s = 0, and it will be the same as x, if s = 1. 

Based on these requirements, we can specify the desired circuit in the form of a truth 
table given in Figure 2.33a. From the truth table, we derive the canonical sum of products 


Sf (S, x1, X2) = SxyXo + SxyxX2 + SXpxX2 + $xyxX2 


2.8 DESIGN EXAMPLES 


f 
al 
#9 
%3 
(a) Sum-of-products realization 
X3 
Xo 
mal 
f 


(b) Product-of-sums realization 


Figure 2.32 —_ Implementation of the function in Figure 2.31. 


Using the distributive property, this expression can be written as 
f = 5x1 %2 + X2) + 81 + x1) x2 
Applying theorem 8d yields 
f =sx,-1l+s-1-x 
Finally, theorem 6a gives 


Sf = 5x1 + 5x2 
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SX Xo F (8,1, %9) 
000 0 
001 0 
010 1 
O11 1 
100 0 
101 1 
110 0 
111 1 


(a) Truth table 


“I Ss 
f xy 
f 
s a) 
X95 
(b) Circuit (c) Graphical symbol 
Ss F(S,X1, X2) 
0 xy 
1 X5 
(d) More compact truth-table representation 
Figure 2.33 — Implementation of a multiplexer. 


A circuit that implements this function is shown in Figure 2.33b. Circuits of this type are 
used so extensively that they are given a special name. A circuit that generates an output 
that exactly reflects the state of one of a number of data inputs, based on the value of one 
or more selection control inputs, is called a multiplexer. We say that a multiplexer circuit 
“multiplexes” input signals onto a single output. 


2.8 DESIGN EXAMPLES 


In this example we derived a multiplexer with two data inputs, which is referred to 
as a “2-to-1 multiplexer.” A commonly used graphical symbol for the 2-to-1 multiplexer 
is shown in Figure 2.33c. The same idea can be extended to larger circuits. A 4-to-1 
multiplexer has four data inputs and one output. In this case two selection control inputs 
are needed to choose one of the four data inputs that is transmitted as the output signal. An 
8-to-1 multiplexer needs eight data inputs and three selection control inputs, and so on. 

Note that the statement “f = x, if s = 0, and f = x, if s = 1” can be presented in a 
more compact form of a truth table, as indicated in Figure 2.33d. In later chapters we will 
have occasion to use such representation. 

We showed how a multiplexer can be built using AND, OR, and NOT gates. The same 
circuit structure can be used to implement the multiplexer using NAND gates, as explained 
in Section 2.7. In Appendix B we will show other possibilities for constructing multiplexers. 
In Chapter 4 we will discuss the use of multiplexers in considerable detail. 


2.8.3 NUMBER DISPLAY 


In Example 2.2 we designed an adder circuit that generates the arithmetic sum S = a + b, 
where a and b are one-bit numbers and S = s 50 provides the resulting two-bit sum, which 
is either 00, 01, or 10. In this design example we wish to create a logic circuit that drives 
a familiar seven-segment display, as illustrated in Figure 2.34a. This display allows us to 
show the value of S as a decimal number, either 0, 1, or 2. The display includes seven 


So Logic 
circuit 


(b) Truth table 


Figure 2.34 _ Display of numbers. 
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segments, labeled a, b,..., g in the figure, where each segment is a light-emitting diode 
(LED). Our logic circuit has the two inputs s; and so. It produces seven outputs, one for 
each segment in the display. Setting an output to the value | illuminates the corresponding 
segment in the display. By illuminating specific segments for each valuation of s;59 we can 
make the display’s appearance correspond to the shape of the appropriate decimal digit. 

Part (b) of Figure 2.34 shows a truth table for the possible valuations of 5,59 and 
indicates on the lefthand side how the display should appear in each case. The truth table 
specifies the logic values needed for each of the seven functions. For example, segment a 
in the 7-segment display needs to be turned on when S has the decimal values 0 or 2, but 
has to be off when S has the value 1. Hence, the corresponding logic function is set to | for 
minterms mo and myo, giving a = 5159 + 5159 = So. Logic expressions for each of the seven 
functions are: 


a=d=e=%8 9 
b=1 

c=S3 

f = 5150 

& = 5150 


Designers of logic circuits rely heavily on CAD tools. We want to encourage the reader 
to become familiar with CAD tools as soon as possible. We have reached a point where 
an introduction to these tools is useful. The next section presents some basic concepts that 
are needed to use these tools. We will also introduce, in Section 2.10, a special language 
for describing logic circuits, called Verilog. This language is used to describe the circuits 
as an input to the CAD tools, which then proceed to derive a suitable implementation. 


2.9 INTRODUCTION TO CAD TooLs 


The preceding sections introduced a basic approach for synthesis of logic circuits. A de- 
signer could use this approach manually for small circuits. However, logic circuits found 
in complex systems, such as today’s computers, cannot be designed manually—they are 
designed using sophisticated CAD tools that automatically implement the synthesis tech- 
niques. 

To design a logic circuit, a number of CAD tools are needed. They are usually packaged 
together into a CAD system, which typically includes tools for the following tasks: design 
entry, logic synthesis and optimization, simulation, and physical design. We will introduce 
some of these tools in this section and will provide additional discussion in later chapters. 


2.9.1 DESIGN ENTRY 
The starting point in the process of designing a logic circuit is the conception of what the 


circuit is supposed to do and the formulation of its general structure. This step is done 
manually by the designer because it requires design experience and intuition. The rest 


2.9 INTRODUCTION TO CAD TooLs 


of the design process is done with the aid of CAD tools. The first stage of this process 
involves entering into the CAD system a description of the circuit being designed. This 
stage is called design entry. We will describe two design entry methods: using schematic 
capture and writing source code in a hardware description language. 


Schematic Capture 


A logic circuit can be defined by drawing logic gates and interconnecting them with 
wires. ACAD tool for entering a designed circuit in this way is called a schematic capture 
tool. The word schematic refers to a diagram of a circuit in which circuit elements, such 
as logic gates, are depicted as graphical symbols and connections between circuit elements 
are drawn as lines. 

A schematic capture tool uses the graphics capabilities of a computer and a computer 
mouse to allow the user to draw a schematic diagram. To facilitate inclusion of gates 
in the schematic, the tool provides a collection of graphical symbols that represent gates 
of various types with different numbers of inputs. This collection of symbols is called a 
library. The gates in the library can be imported into the user’s schematic, and the tool 
provides a graphical way of interconnecting the gates to create a logic network. 

Any subcircuits that have been previously created can be represented as graphical 
symbols and included in the schematic. In practice it is common for a CAD system user to 
create a circuit that includes within it other smaller circuits. This methodology is known 
as hierarchical design and provides a good way of dealing with the complexities of large 
circuits. 

The schematic-capture method is simple to use, but becomes awkward when large 
circuits are involved. A better method for dealing with large circuits is to write source code 
using a hardware description language to represent the circuit. 


Hardware Description Languages 


A hardware description language (HDL) is similar to a typical computer programming 
language except that an HDL is used to describe hardware rather than a program to be 
executed on a computer. Many commercial HDLs are available. Some are proprietary, 
meaning that they are provided by a particular company and can be used to implement 
circuits only in the technology offered by that company. We will not discuss the proprietary 
HDLs in this book. Instead, we will focus on a language that is supported by virtually 
all vendors that provide digital hardware technology and is officially endorsed as an Insti- 
tute of Electrical and Electronics Engineers (IEEE) standard. The IEEE is a worldwide 
organization that promotes technical activities to the benefit of society in general. One of 
its activities involves the development of standards that define how certain technological 
concepts can be used in a way that is suitable for a large body of users. 

Two HDLs are IEEE standards: Verilog HDL and VHDL (Very High Speed Integrated 
Circuit Hardware Description Language). Both languages are in widespread use in the 
industry. We use Verilog in this book, but a VHDL version of the book is also available 
from the same publisher [4]. Although the two languages differ in many ways, the choice 
of using one or the other when studying logic circuits is not particularly important, because 
both offer similar features. Concepts illustrated in this book using Verilog can be directly 
applied when using VHDL. 
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In comparison to performing schematic capture, using Verilog offers a number of advan- 
tages. Because it is supported by most organizations that offer digital hardware technology, 
Verilog provides design portability. A circuit specified in Verilog can be implemented in dif- 
ferent types of chips and with CAD tools provided by different companies, without having 
to change the Verilog specification. Design portability is an important advantage because 
digital circuit technology changes rapidly. By using a standard language, the designer can 
focus on the functionality of the desired circuit without being overly concerned about the 
details of the technology that will eventually be used for implementation. 

Design entry of a logic circuit is done by writing Verilog code. Signals in the circuit 
can be represented as variables in the source code, and logic functions are expressed by 
assigning values to these variables. Verilog source code is plain text, which makes it easy 
for the designer to include within the code documentation that explains how the circuit 
works. This feature, coupled with the fact that Verilog is widely used, encourages sharing 
and reuse of Verilog-described circuits. This allows faster development of new products in 
cases where existing Verilog code can be adapted for use in the design of new circuits. 

Similar to the way in which large circuits are handled in schematic capture, Verilog 
code can be written in a modular way that facilitates hierarchical design. Both small and 
large logic circuit designs can be efficiently represented in Verilog code. 

Verilog design entry can be combined with other methods. For example, a schematic- 
capture tool can be used in which a subcircuit in the schematic is described using Verilog. 
We will introduce Verilog in Section 2.10. 


2.9.2 Locic SYNTHESIS 


Synthesis is the process of generating a logic circuit from an initial specification that may 
be given in the form of a schematic diagram or code written in a hardware description 
language. Synthesis CAD tools generate efficient implementations of circuits from such 
specifications. 

The process of translating, or compiling, Verilog code into a network of logic gates is 
part of synthesis. The output is a set of logic expressions that describe the logic functions 
needed to realize the circuit. 

Regardless of what type of design entry is used, the initial logic expressions produced by 
the synthesis tools are not likely to be in an optimal form because they reflect the designer’s 
input to the CAD tools. It is impossible for a designer to manually produce optimal results 
for large circuits. So, one of the important tasks of the synthesis tools is to manipulate the 
user’s design to automatically generate an equivalent, but better circuit. 

The measure of what makes one circuit better than another depends on the particular 
needs of a design project and the technology chosen for implementation. Earlier in this 
chapter we suggested that a good circuit might be one that has the lowest cost. There are other 
possible optimization goals, which are motivated by the type of hardware technology used 
for implementation of the circuit. We discuss implementation technologies in Appendix B. 

The performance of a synthesized circuit can be assessed by physically constructing 
the circuit and testing it. But, its behavior can also be evaluated by means of simulation. 
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2.9.3 FUNCTIONAL SIMULATION 


A circuit represented in the form of logic expressions can be simulated to verify that it 
will function as expected. The tool that performs this task is called a functional simulator. 
It uses the logic expressions (often referred to as equations) generated during synthesis, 
and assumes that these expressions will be implemented with perfect gates through which 
signals propagate instantaneously. The simulator requires the user to specify valuations 
of the circuit’s inputs that should be applied during simulation. For each valuation, the 
simulator evaluates the outputs produced by the expressions. The results of simulation are 
usually provided in the form of a timing diagram which the user can examine to verify that 
the circuit operates as required. 


2.9.4 PHYSICAL DESIGN 


After logic synthesis the next step in the design flow is to determine exactly how to imple- 
ment the circuit on a given chip. This step is often called physical design. As we discuss in 
Appendix B, there are several different technologies that may be used to implement logic 
circuits. The physical design tools map a circuit specified in the form of logic expressions 
into a realization that makes use of the resources available on the target chip. They deter- 
mine the placement of specific logic elements, which are not necessarily simple gates of 
the type we have encountered so far. They also determine the wiring connections that have 
to be made between these elements to implement the desired circuit. 


2.9.5 TIMING SIMULATION 


Logic gates and other logic elements are implemented with electronic circuits, and these 
circuits cannot perform their function with zero delay. When the values of inputs to the 
circuit change, it takes a certain amount of time before a corresponding change occurs at 
the output. This is called a propagation delay of the circuit. The propagation delay consists 
of two kinds of delays. Each logic element needs some time to generate a valid output 
signal whenever there are changes in the values of its inputs. In addition to this delay, there 
is a delay caused by signals that must propagate along wires that connect various logic 
elements. The combined effect is that real circuits exhibit delays, which has a significant 
impact on their speed of operation. 

A timing simulator evaluates the expected delays of a designed logic circuit. Its results 
can be used to determine if the generated circuit meets the timing requirements of the 
specification for the design. If the requirements are not met, the designer can ask the 
physical design tools to try again by indicating specific timing constraints that have to be 
met. If this does not succeed, then the designer has to try different optimizations in the 
synthesis step, or else improve the initial design that is presented to the synthesis tools. 
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2.9.6 CIRCUIT IMPLEMENTATION 


Having ascertained that the designed circuit meets all requirements of the specification, the 
circuit is implemented on an actual chip. If a custom-manufactured chip is created for this 
design, then this step is called chip fabrication. But if a programmable hardware device 
is used, then this step is called chip configuration or programming. Various types of chip 
technologies are described in Appendix B. 


2.9.7 COMPLETE DESIGN FLow 


The CAD tools discussed above are the essential parts of a CAD system. The complete 
design flow that we discussed is illustrated in Figure 2.35. This has been just a brief 
introductory discussion. A full presentation of the CAD tools is given in Chapter 10. 

At this point the reader should have some appreciation for what is involved when using 
CAD tools. However, the tools can be fully appreciated only when they are used firsthand. 
We strongly encourage the reader to obtain access to suitable CAD tools and implement 
some examples of circuits by using these tools. Two examples of commonly-used CAD 
tools are the Quartus II tools available from Altera Corporation and the ISE tools provided 
by Xilinx Corporation. Both of these CAD systems can be obtained free-of-charge for 
educational use from their respective corporations’ websites. 


2.10 INTRODUCTION TO VERILOG 


In the 1980s rapid advances in integrated circuit technology lead to efforts to develop 
standard design practices for digital circuits. Verilog was produced as a part of that effort. 
The original version of Verilog was developed by Gateway Design Automation, which was 
later acquired by Cadence Design Systems. In 1990 Verilog was put into the public domain, 
and it has since become one of the most popular languages for describing digital circuits. 
In 1995 Verilog was adopted as an official IEEE Standard, called 1364-1995. An enhanced 
version of Verilog, called Verilog 2001, was adopted as IEEE Standard 1364-2001 in 2001. 
While this version introduced a number of new features, it also supports all of the features 
in the original Verilog standard. 

Verilog was originally intended for simulation and verification of digital circuits. Sub- 
sequently, with the addition of synthesis capability, Verilog has also become popular for 
use in design entry in CAD systems. The CAD tools are used to synthesize the Verilog 
code into a hardware implementation of the described circuit. In this book our main use of 
Verilog will be for synthesis. 

Verilog is a complex, sophisticated language. Learning all of its features is a daunting 
task. However, for use in synthesis only a subset of these features is important. To simplify 
the presentation, we will focus the discussion on the features of the Verilog language that 
are actually used in the examples in the book. The material presented is sufficient to allow 
the reader to design a wide range of circuits. The reader who wishes to learn the complete 
Verilog language can refer to one of the specialized texts [5—11]. 
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Design conception 


Figure 2.35 


Functional simulation 


Timing requirements met? 


Yes 
Chip configuration 


A typical CAD system. 
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Verilog is introduced in several stages throughout the book. Our general approach will 
be to introduce particular features only when they are relevant to the design topics covered 
in that part of the text. In Appendix A we provide a concise summary of the Verilog features 
covered in the book. The reader will find it convenient to refer to that material from time to 
time. In the remainder of this chapter we discuss the most basic concepts needed to write 
simple Verilog code. 


Representation of Digital Circuits in Verilog 

When using CAD tools to synthesize a logic circuit, the designer can provide the initial 
description of the circuit in several different ways, as we explained in the previous section. 
One efficient way is to write this description in the form of Verilog source code. The Verilog 
compiler translates this code into a logic circuit. 

Verilog allows the designer to describe a desired circuit in a number of ways. One 
possibility is to use Verilog constructs that describe the structure of the circuit in terms 
of circuit elements, such as logic gates. A larger circuit is defined by writing code that 
connects such elements together. This approach is referred to as the structural representation 
of logic circuits. Another possibility is to describe a circuit more abstractly, by using 
logic expressions and Verilog programming constructs that define the desired behavior of 
the circuit, but not its actual structure in terms of gates. This is called the behavioral 
representation. 


2.10.1 STRUCTURAL SPECIFICATION OF LOGIC CIRCUITS 


Verilog includes a set of gate-level primitives that correspond to commonly-used logic gates. 
A gate is represented by indicating its functional name, output, and inputs. For example, a 
two-input AND gate, with output y and inputs x; and x2, is denoted as 


and (y, x1, x2); 
A four-input OR gate is specified as 
or (y, x1, x2, x3, x4); 


The keywords nand and nor are used to define the NAND and NOR gates in the same way. 
The NOT gate given by 


not (y, x); 


implements y = x. The gate-level primitives can be used to specify larger circuits. All of 
the available Verilog gate-level primitives are listed in Table A.2 in Appendix A. 

A logic circuit is specified in the form of a module that contains the statements that 
define the circuit. A module has inputs and outputs, which are referred to as its ports. 
The word port is a commonly-used term that refers to an input or output connection to an 
electronic circuit. Consider the multiplexer circuit from Figure 2.33b, which is reproduced 
in Figure 2.36. This circuit can be represented by the Verilog code in Figure 2.37. The 
first statement gives the module a name, example/, and indicates that there are four port 
signals. The next two statements declare that x,, x2, and s are to be treated as input signals, 


2.10 INTRODUCTION TO VERILOG 


xy 


> 


Figure 2.36 = The logic circuit for a multiplexer. 


module example! (x1, x2, s, f); 
input x1, x2, s; 
output f; 


not (k, s); 
and (g, k, x1); 
and (h, s, x2); 
or (f, g, h); 


endmodule 


Figure 2.37 Verilog code for the circuit in Figure 2.36. 


while fis the output. The actual structure of the circuit is specified in the four statements 
that follow. The NOT gate gives k = s. The AND gates produce g = sx, and h = sx2. The 
outputs of AND gates are combined in the OR gate to form 


f=gth 
= SX] + $X2 


The module ends with the endmodule statement. We have written the Verilog keywords 
in bold type to make the text easier to read. We will continue this practice throughout the 
book. 

A second example of Verilog code is given in Figure 2.38. It defines a circuit that has 
four input signals, x), x2, x3, and x4, and three output signals, f, g, and h. It implements the 
logic functions 


& = X1X3 + X2X4 
h= (x, +X3)(%2 + x4) 
f=gth 


71 


72 


CHAPTER 2 . INTRODUCTION TO LOGIC CIRCUITS 


module example? (x1, x2, x3, x4, f, g, h); 
input x1, x2, x3, x4; 
output f, g, h; 


and (zl, x1, x3); 
and (z2, x2, x4); 
or (g, z1, 22); 

or (z3, xl, ~x3); 
or (z4, ~x2, x4); 
and (h, z3, 24); 
or (f, g, h); 


endmodule 


Figure 2.38 Verilog code for a four-input circuit. 


Instead of using explicit NOT gates to define x2 and x3, we have used the Verilog operator 
“~” (tilde character on the keyboard) to denote complementation. Thus, x2 is indicated as 
~x2 in the code. The circuit produced by the Verilog compiler for this example is shown 
in Figure 2.39. 


Verilog Syntax 

The names of modules and signals in Verilog code follow two simple rules: the name 
must start with a letter, and it can contain any letter or number plus the “_” underscore 
and “$” characters. Verilog is case sensitive. Thus, the name k is not the same as K and 
Example! is not the same as examplel. The Verilog syntax does not enforce a particular 
style of code. For example, multiple statements can appear on a single line. White space 
characters, such as SPACE and TAB, and blank lines are ignored. As a matter of good 
style, code should be formatted in such a way that it is easy to read. Indentation and blank 
lines can be used to make separate parts of the code easily recognizable, as we have done in 
Figures 2.37 and 2.38. Comments may be included in the code to improve its readability. 
A comment begins with the double slash “//” and continues to the end of the line. 


2.10.2 BEHAVIORAL SPECIFICATION OF LOGIC CIRCUITS 


Using gate-level primitives can be tedious when large circuits have to be designed. An 
alternative is to use more abstract expressions and programming constructs to describe the 
behavior of a logic circuit. One possibility is to define the circuit using logic expressions. 
Figure 2.40 shows how the circuit in Figure 2.36 can be defined with the expression 


f = 5x, + 5x2 


The AND and OR operations are indicated by the “&” and “|” Verilog operators, respectively. 
The assign keyword provides a continuous assignment for the signal f . The word continuous 
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nal 
x3 
&§ 
a) 
x4 
f 
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Figure 2.39 Logic circuit for the code in Figure 2.38. 


module example3 (x1, x2, s, f); 
input x1, x2, s; 
output f; 
assign f=(~s & x1)| (s & x2); 


endmodule 


Figure 2.40 —_ Using the continuous assignment to specify the 
circuit in Figure 2.36. 


stems from the use of Verilog for simulation; whenever any signal on the right-hand side 
changes its state, the value of f will be re-evaluated. The effect is equivalent to using the 
gate-level primitives in Figure 2.37. Following this approach, the circuit in Figure 2.39 can 
be specified as shown in Figure 2.41. 

Using logic expressions makes it easier to write Verilog code. But even higher levels 
of abstraction can often be used to advantage. Consider again the multiplexer circuit of 
Figure 2.36. The circuit can be described in words by saying that’ = x; ifs = Oandf = x2 
if s = 1. In Verilog, this behavior can be defined with the if-else statement 


if (s == 0) 
f=xl; 
else 


f=x2; 
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module example4 (x1, x2, x3, x4, f, g, h); 
input x1, x2, x3, x4; 
output f, g, h; 


assign g = (xl & x3) | (x2 & x4); 
assign h = (xl | ~x3) & (~x2 | x4); 
assign f=g |h; 


endmodule 


Figure 2.41 Using the continuous assignment to specify the 
circuit in Figure 2.39. 


// Behavioral specification 
module examples (x1, x2, s, f); 
input x1, x2, s; 
output f; 
reg f; 


always @(x1 or x2 ors) 


if (s == 0) 
f=xl; 
else 
f=x2; 
endmodule 


Figure 2.42 — Behavioral specification of the circuit in 
Figure 2.36. 


The complete code is given in Figure 2.42. The first line illustrates how a comment can be 
inserted. The if-else statement is an example of a Verilog procedural statement. We will 
introduce other procedural statements, such as loop statements, in Chapters 3 and 4. 

Verilog syntax requires that procedural statements be contained inside a construct called 
an always block, as shown in Figure 2.42. An always block can contain a single statement, 
as in this example, or it can contain multiple statements. A typical Verilog design module 
may include several always blocks, each representing a part of the circuit being modeled. 
An important property of the always block is that the statements it contains are evaluated 
in the order given in the code. This is in contrast to the continuous assignment statements, 
which are evaluated concurrently and hence have no meaningful order. 

The part of the always block after the @ symbol, in parentheses, is called the sensitivity 
list. This list has its roots in the use of Verilog for simulation. The statements inside 
an always block are executed by the simulator only when one or more of the signals in 
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the sensitivity list changes value. In this way, the complexity of a simulation process is 
simplified, because it is not necessary to execute every statement in the code at all times. 
When Verilog is being employed for synthesis of circuits, as in this book, the sensitivity list 
simply tells the Verilog compiler which signals can directly affect the outputs produced by 
the always block. 

If a signal is assigned a value using procedural statements, then Verilog syntax requires 
that it be declared as a variable; this is accomplished by using the keyword reg in Figure 2.42. 
This term also derives from the simulation jargon: It means that, once a variable’s value 
is assigned with a procedural statement, the simulator “registers” this value and it will not 
change until the always block is executed again. We will discuss this issue in detail in 
Chapter 3. 

Instead of using a separate statement to declare that the variable f is of reg type in 
Figure 2.42, we can alternatively use the syntax 


output reg f; 
which combines these two statements. Also, Verilog 2001 adds the ability to declare a 
signal’s direction and type directly in the module’s list of ports. This style of code is 
illustrated in Figure 2.43. In the sensitivity list of the always statement we can use commas 
instead of the word or, which is also illustrated in Figure 2.43. Moreover, instead of listing 
the relevant signals in the sensitivity list, it is possible to write simply 

always @ (x) 
or even more simply 

always @x« 
assuming that the compiler will figure out which signals need to be considered. 
Behavioral specification of a logic circuit defines only its behavior. CAD synthesis 


tools use this specification to construct the actual circuit. The detailed structure of the 
synthesized circuit will depend on the technology used. 


// Behavioral specification 
module example5 (input x1, x2, s, output reg f); 


always @(x1, x2, s) 


if (s == 0) 
f=xl; 
else 
f=x2; 
endmodule 


Figure 2.43 = A more compact version of the code in Figure 2.42. 


75 


76 


CHAPTER 2 ° INTRODUCTION TO LOGIC CIRCUITS 


2.10.3 HIERARCHICAL VERILOG CODE 


The examples of Verilog code given so far include just a single module. For larger designs, 
it is often convenient to create a hierarchical structure in the Verilog code, in which there 
is a top-level module that includes multiple instances of lower-level modules. To see how 
hierarchical Verilog code can be written consider the circuit in Figure 2.44. This circuit 
comprises two lower-level modules: the adder module that we described in Figure 2.12, and 
the module that drives a 7-segment display which we showed in Figure 2.34. The purpose 
of the circuit is to generate the arithmetic sum of the two inputs x and y, using the adder 
module, and then to show the resulting decimal value on the 7-segment display. 

Verilog code for the adder module from Figure 2.12 and the display module from 
Figure 2.34 is given in Figures 2.45 and 2.46, respectively. For the adder module con- 
tinuous assignment statements are used to specify the two-bit sum s;s9. The assignment 
statement for so uses the Verilog XOR operator, which is specified as so = a“ b. The code 
for the display module includes continuous assignment statements that correspond to the 


Top-level module 


Adder module Display module 


Figure 2.44 =A logic circuit with two modules. 


// An adder module 

module adder (a, b, sl, sO); 
input a, b; 
output s1, sO; 


assign sl =a & b; 
assign sO =a “ b; 


endmodule 


Figure 2.45 _ Verilog specification of the circuit in 
Figure 2.12. 
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// Amodule for driving a 7-segment display 
module display (s1, s0, a, b, c, d, e, f, g); 
input s1, sO; 
output a, b, c, d, e, f, g; 


assign a = ~s0; 
assign b = 1; 

assign c = ~s1; 
assign d =~ s0; 
assign e = ~s0; 
assign f = ~sl & ~s0; 
assign g = sl & ~s0; 


endmodule 


Figure 2.46 _ Verilog specification of the circuit in Figure 2.34. 


module adder_display (x, y, a, b, c, d, e, f, g); 
input x, y; 
output a, b, c, d, e, f, g; 
wire wl, w0; 


adder U1 (x, y, wl, w0); 
display U2 (wl, w0, a, b, c, d, e, f, g); 


endmodule 


Figure 2.47 — Hierarchical Verilog code for the circuit in 
Figure 2.44. 


logic expressions for each of the seven outputs of the display circuit, which are given in 
Section 2.8.3. The statement 
assign b = 1; 


assigns the output b of the display module to have the constant value 1. We discuss the 
specification of numbers in Verilog code in Chapter 3. 

The top-level Verilog module, named adder_display, is given in Figure 2.47. This 
module has the inputs x and y, and the outputs a,..., g. The statement 


wire wl, w0; 


is needed because the signals w, and wo are neither inputs nor outputs of the circuit in 
Figure 2.44. Since these signals cannot be declared as input or output ports in the Verilog 
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code, they have to be declared as (internal) wires. The statement 
adder U1 (x, y, wl, w0); 


instantiates the adder module from Figure 2.45 as a submodule. The submodule is given 
a name, U1, which can be any valid Verilog name. In this instantiation statement the 
signals attached to the ports of the adder submodule are listed in the same order as those 
in Figure 2.45. Thus, the input ports x and y of the top-level module in Figure 2.47 are 
connected to the first two ports of adder, which are named a and b. The order in which 
signals are listed in the instantiation statement determines which signal is connected to each 
port in the submodule. The instantiation statement also attaches the last two ports of the 
adder submodule, which are its outputs, to the wires wl! and w0 in the top-level module. 
The statement 


display U2 (wl, w0, a, b, c, d, e, f, g); 


instantiates the other submodule in our circuit. Here, the wires wl and w0, which have 
already been connected to the outputs of the adder submodule, are attached to the corre- 
sponding input ports of the display submodule. The display submodule’s output ports are 
attached to the a, ..., g output ports of the top-level module. 


2.10.4 How Nort TO WRITE VERILOG CODE 


When learning how to use Verilog or other hardware description languages, the tendency for 
the novice is to write code that resembles a computer program, containing many variables 
and loops. It is difficult to determine what logic circuit the CAD tools will produce when 
synthesizing such code. This book contains more than 100 examples of complete Verilog 
code that represent a wide range of logic circuits. In these examples the code is easily 
related to the described logic circuit. The reader is advised to adopt the same style of code. 
A good general guideline is to assume that if the designer cannot readily determine what 
logic circuit is described by the Verilog code, then the CAD tools are not likely to synthesize 
the circuit that the designer is trying to model. 

Once complete Verilog code is written for a particular design, the reader is encouraged 
to analyze the resulting circuit produced by the CAD synthesis tools; typical CAD systems 
provide graphical viewing tools that can display a logic circuit that corresponds to the output 
produced by the Verilog compiler. Much can be learned about Verilog, logic circuits, and 
logic synthesis through this process. We provide additional guidelines for writing Verilog 
code in Appendix A. 


2.11 MINIMIZATION AND KARNAUGH MAPS 


In a number of our examples we have used algebraic manipulation to find a reduced-cost 
implementation of a function in either sum-of-products or product-of-sums form. In these 
examples, we made use of the rules, theorems, and properties of Boolean algebra that 
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Row 
number | x; Xx, 3 f 
0 0 Oo 0O 1 
1 0 oO 1 0 
2 Oo 1 0 1 
3 Oo 1 1 0 
4 1 0 0O 1 
5 1 oO 1 1 
6 1 1 0O 1 
7 1 1 1 0 


Figure 2.48 The function f (x1, x2, x3) = ) m(0, 2, 4, 5, 6). 


were introduced in Section 2.5. For example, we often used the distributive property, 
DeMorgan’s theorem, and the combining property. In general, it is not obvious when 
to apply these theorems and properties to find a minimum-cost circuit, and it is often 
tedious and impractical to do so. This section introduces a more manageable approach, call 
the Karnaugh map, which provides a systematic way of producing a minimum-cost logic 
expression. 

The key to the Karnaugh map approach is that it allows the application of the combining 
property 14a, or 145, as judiciously as possible. To understand how it works consider the 
function f in Figure 2.48. The canonical sum-of-products expression for f consists of 
minterms mo, m2, m4, m5, and me, so that 


f =X X2X3 + Xp X2X3 + XyX2X3z + XX2Xz + Xj X2X3 


The combining property 14a allows us to replace two minterms that differ in the value of 
only one variable with a single product term that does not include that variable at all. For 
example, both mo and mz include x; and x3, but they differ in the value of x. because mo 
includes X2 while m2 includes x2. Thus 


X1X2X3 + Xj X2X3 = X1(%2 + X2)H3 


=x,-1-%3 


Hence mo and mp can be replaced by the single product term X,x3. Similarly, m4 and m6 
differ only in the value of x2 and can be combined using 


XXQXz + XyxX2X3 = Xy (XQ + X2)x3 
=X: 1- X3 


= X1X3 
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Now the two newly-generated terms, x3 and x,x3, can be combined further as 


XyX3 + XyX3 = (X + x1) x3 
= 1-3 
These optimization steps indicate that we can replace the four minterms mo, m2, m4, and 
mo With the single product term x3. In other words, the minterms mo, m2, m4, and me are 
all included in the term x3. The remaining minterm in f is ms. It can be combined with ma, 
which gives 


X1X2X3 + X1X2X3 = X1X2 
Recall that theorem 7b in Section 2.5 indicates that 
m4 = m4 +4 


which means that we can use the minterm m4 twice—to combine with minterms mo, m2, 
and me to yield the term x3 as explained above and also to combine with ms to yield the 
term x|X>. 

We have now accounted for all the minterms in f; hence all five input valuations for 
which f = 1 are covered by the minimum-cost expression 


f =%34+x1X2 


The expression has the product term x3 because f = | when x3 = 0 regardless of the values 
of x; and x2. The four minterms mo, m2, m4, and mg represent all possible minterms for 
which x3 = 0; they include all four valuations, 00, 01, 10, and 11, of variables x; and x». 
Thus if x3 = 0, then it is guaranteed that f = 1. This may not be easy to see directly from 
the truth table in Figure 2.48, but it is obvious if we write the corresponding valuations 
grouped together: 


xX] x2 x3 
mo 0 0 0 
m2 0 (| 0 
m4 | 0 0 
™6 1 1 0 


In a similar way, if we look at m4 and ms as a group of two 


x1 x2 X3 
m4 1 0 
ms 1 


it is clear that when x; = | and x2 = 0, then f = 1 regardless of the value of x3. 
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0 O Mg 
0 1 my 
1 0 My 
1 1 m, 
(a) Truth table (b) Karnaugh map 


Figure 2.49 _ Location of two-variable minterms. 


The preceding discussion suggests that it would be advantageous to devise a method 
that allows easy discovery of groups of minterms for which f = 1 that can be combined 
into single terms. The Karnaugh map is a useful vehicle for this purpose. 

The Karnaugh map [1] is an alternative to the truth-table form for representing a 
function. The map consists of cells that correspond to the rows of the truth table. Consider 
the two-variable example in Figure 2.49. Part (a) depicts the truth-table form, where each 
of the four rows is identified by a minterm. Part (b) shows the Karnaugh map, which has 
four cells. The columns of the map are labeled by the value of x, and the rows are labeled 
by x2. This labeling leads to the locations of minterms as shown in the figure. Compared 
to the truth table, the advantage of the Karnaugh map is that it allows easy recognition of 
minterms that can be combined using property 14a from Section 2.5. Minterms in any two 
cells that are adjacent, either in the same row or the same column, can be combined. For 
example, the minterms m, and m3 can be combined as 


mz + m3 = X\X2 + X1X2 


= X1(X%2 + X2) 
=X, -1 
=X 


The Karnaugh map is not just useful for combining pairs of minterms. As we will see in 
several larger examples, the Karnaugh map can be used directly to derive a minimum-cost 
circuit for a logic function. 


Two-Variable Map 


A Karnaugh map for a two-variable function is given in Figure 2.50. It corresponds to 
the function f of Figure 2.19. The value of f for each valuation of the variables x; and x2 
is indicated in the corresponding cell of the map. Because a | appears in both cells of the 
bottom row and these cells are adjacent, there exists a single product term that can cause 
f to be equal to | when the input variables have the values that correspond to either of 
these cells. To indicate this fact, we have circled the cell entries in the map. Rather than 
using the combining property formally, we can derive the product term intuitively. Both of 
the cells are identified by x. = 1, but x, = 0 for the left cell and x, = | for the right cell. 
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Figure 2.50 The function of Figure 2.19. 


Thus if x. = 1, then f = | regardless of whether x, is equal to 0 or 1. The product term 
representing the two cells is simply x2. 

Similarly, f = 1 for both cells in the first column. These cells are identified by x; = 0. 
Therefore, they lead to the product term x;. Since this takes care of all instances where 
f = 1, it follows that the minimum-cost realization of the function is 


f =x2+% 


Evidently, to find a minimum-cost implementation of a given function, it is necessary 
to find the smallest number of product terms that produce a value of | for all cases where 
f = 1. Moreover, the cost of these product terms should be as low as possible. Note that a 
product term that covers two adjacent cells is cheaper to implement than a term that covers 
only a single cell. For our example once the two cells in the bottom row have been covered 
by the product term x2, only one cell (top left) remains. Although it could be covered by 
the term X)X, it is better to combine the two cells in the left column to produce the product 
term x, because this term is cheaper to implement. 


Three-Variable Map 


A three-variable Karnaugh map is constructed by placing 2 two-variable maps side 
by side. Figure 2.51a lists all of the three-variable minterms, and part (b) of the figure 
indicates the locations of these minterms in the Karnaugh map. In this case each valuation 
of x; and x2 identifies a column in the map, while the value of x3 distinguishes the two 
rows. To ensure that minterms in the adjacent cells in the map can always be combined 
into a single product term, the adjacent cells must differ in the value of only one variable. 
Thus the columns are identified by the sequence of (x;, x2) values of 00, 01, 11, and 10, 
rather than the more obvious 00, 01, 10, and 11. This makes the second and third columns 
different only in variable x,. Also, the first and the fourth columns differ only in variable 
x,, which means that these columns can be considered as being adjacent. The reader may 
find it useful to visualize the map as a rectangle folded into a cylinder where the left and the 
right edges in Figure 2.51b are made to touch. (A sequence of codes, or valuations, where 
consecutive codes differ in one variable only is known as the Gray code. This code is used 
for a variety of purposes, some of which will be encountered later in the book.) 

Figure 2.52a represents the function of Figure 2.23 in Karnaugh-map form. To synthe- 
size this function, it is necessary to cover the four Is in the map as efficiently as possible. 
It is not difficult to see that two product terms suffice. The first covers the Is in the top row, 
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Figure 2.52 — Examples of three-variable Karnaugh maps. 


which are represented by the term x;x3. The second term is X2x3, which covers the Is in 
the bottom row. Hence the function is implemented as 


f = X1%3 + X2x3 


which describes the circuit obtained in Figure 2.24a. 
In a three-variable map it is possible to combine cells to produce product terms that 
correspond to a single cell, two adjacent cells, or a group of four adjacent cells. Realization 
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of a group of four adjacent cells using a single product term is illustrated in Figure 2.52b, 
using the function from Figure 2.48. The four cells in the top row correspond to the 
(x1, X2, x3) valuations 000, 010, 110, and 100. As we discussed before, this indicates that if 
x3 = 0, then f = 1 for all four possible valuations of x; and x2, which means that the only 
requirement is that x; = 0. Therefore, the product term x3 represents these four cells. The 
remaining 1, corresponding to minterm ms, is best covered by the term x;x2, obtained by 
combining the two cells in the right-most column. The complete realization of f is 


f =X34+-x1X2 


It is also possible to have a group of eight 1s in a three-variable map. This is the trivial case 
of a function where f = 1 for all valuations of input variables; in other words, f is equal to 
the constant 1. 

The Karnaugh map provides a simple mechanism for generating the product terms that 
should be used to implement a given function. A product term must include only those 
variables that have the same value for all cells in the group represented by this term. If the 
variable is equal to | in the group, it appears uncomplemented in the product term; if it is 
equal to 0, it appears complemented. Each variable that is sometimes | and sometimes 0 
in the group does not appear in the product term. 


Four-Variable Map 

A four-variable map is constructed by placing 2 three-variable maps together to create 
four rows in the same fashion as we used 2 two-variable maps to form the four columns 
in a three-variable map. Figure 2.53 shows the structure of the four-variable map and 
the location of minterms. We have included in this figure another frequently used way of 
designating the rows and columns. As shown in blue, it is sufficient to indicate the rows 
and columns for which a given variable is equal to 1. Thus x; = | for the two right-most 
columns, x» = | for the two middle columns, x3 = 1 for the bottom two rows, and x4 = 1 
for the two middle rows. 

Figure 2.54 gives four examples of four-variable functions. The function /, has a group 
of four 1s in adjacent cells in the bottom two rows, for which x2 = 0 and x3 = 1—they are 


Figure 2.53 =A four-variable Karnaugh map. 
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Figure 2.54 — Examples of four-variable Karnaugh maps. 


represented by the product term X23. This leaves the two Is in the second row to be covered, 
which can be accomplished with the term x34. Hence the minimum-cost implementation 
of the function is 


Si = X2x3 + X34 


The function f> includes a group of eight 1s that can be implemented by a single term, x3. 
Again, the reader should note that if the remaining two 1s were implemented as a group 
of two, the result would be the product term x;%3x4. Implementing these Is as a part of a 
group of four 1s, as shown in the figure, gives the less expensive product term x14. 

Just as the left and the right edges of the map are adjacent in terms of the assignment 
of the variables, so are the top and the bottom edges. Indeed, the four corners of the map 
are adjacent to each other and thus can form a group of four 1s, which may be implemented 
by the product term X2x4. This case is depicted by the function /;. In addition to this group 


85 


86 


CHAPTER 2 ° INTRODUCTION TO LOGIC CIRCUITS 


of 1s, there are four other 1s that must be covered to implement f;. This can be done as 
shown in the figure. 

In all examples that we have considered so far, a unique solution exists that leads to 
a minimum-cost circuit. The function f4 provides an example where there is some choice. 
The groups of four Is in the top-left and bottom-right corners of the map are realized by the 
terms x x3 and x;x3, respectively. This leaves the two Is that correspond to the term x1x2%3. 
But these two Is can be realized more economically by treating them as a part of a group 
of four 1s. They can be included in two different groups of four, as shown in the figure. 
One choice leads to the product term x;x2, and the other leads to x2x3. Both of these terms 
have the same cost; hence it does not matter which one is chosen in the final circuit. Note 
that the complement of x3 in the term x2x3 does not imply an increased cost in comparison 
with x,x2, because this complement must be generated anyway to produce the term X,x3, 
which is included in the implementation. 


Five- Variable Map 

We can use 2 four-variable maps to construct a five-variable map. It is easy to imagine 
a structure where one map is directly behind the other, and they are distinguished by x5 = 0 
for one map and xs = | for the other map. Since such a structure is awkward to draw, we 
can simply place the two maps side by side as shown in Figure 2.55. For the logic function 
given in this example, two groups of four Is appear in the same place in both four-variable 
maps; hence their realization does not depend on the value of x5. The same is true for the 
two groups of two Is in the second row. The | in the top-right corner appears only in the 
right map, where x5 = |; it is a part of the group of two Is realized by the term x1 X2x3x5. 
Note that in this map we left blank those cells for which f = 0, to make the figure more 
readable. We will do likewise in a number of maps that follow. 

Using a five-variable map is obviously more awkward than using maps with fewer 
variables. Extending the Karnaugh map concept to more variables is not useful from 
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fi = Xp Xz +X, XZXq + XX X3ZX5 


Figure 2.55 _—_A five-variable Karnaugh map. 


2.12 STRATEGY FOR MINIMIZATION 


the practical point of view. This is not troublesome, because practical synthesis of logic 
functions is done with CAD tools that perform the necessary minimization automatically. 
Although Karnaugh maps are occasionally useful for designing small logic circuits, our main 
reason for introducing the Karnaugh maps is to provide a simple vehicle for illustrating the 
ideas involved in the minimization process. 
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For the examples in the preceding section, we used an intuitive approach to decide how the Is 
in a Karnaugh map should be grouped together to obtain the minimum-cost implementation 
of a given function. Our intuitive strategy was to find as few as possible and as large as 
possible groups of Is that cover all cases where the function has a value of 1. Each group 
of Is has to comprise cells that can be represented by a single product term. The larger 
the group of Is, the fewer the number of variables in the corresponding product term. This 
approach worked well because the Karnaugh maps in our examples were small. For larger 
logic functions, which have many variables, such intuitive approach is unsuitable. Instead, 
we must have an organized method for deriving a minimum-cost implementation. In this 
section we will introduce a possible method, which is similar to the techniques that are 
automated in CAD tools. To illustrate the main ideas, we will use Karnaugh maps. 


2.12.1 TERMINOLOGY 


A huge amount of research work has gone into the development of techniques for synthesis 
of logic functions. The results of this research have been published in numerous papers. 
To facilitate the presentation of the results, certain terminology has evolved that avoids 
the need for using highly descriptive phrases. We define some of this terminology in the 
following paragraphs because it is useful for describing the minimization process. 


Literal 

A given product term consists of some number of variables, each of which may appear 
either in uncomplemented or complemented form. Each appearance of a variable, either 
uncomplemented or complemented, is called a literal. For example, the product term x) x2x3 
has three literals, and the term X).x3%4%6 has four literals. 


Implicant 

A product term that indicates the input valuation(s) for which a given function is equal 
to 1 is called an implicant of the function. The most basic implicants are the minterms, 
which we introduced in Section 2.6.1. For an n-variable function, a minterm is an implicant 
that consists of n literals. 

Consider the three-variable function in Figure 2.56. There are 11 implicants for this 
function. This includes the five minterms: %1%2%3, X)X2x3, X1x2X3, XyxX2X3, and xx2x3. 
Then there are the implicants that correspond to all possible pairs of minterms that can be 
combined, namely, x1X2 (mo and m,), X1X3 (mo and mz), X;x3 (m, and m3), X;xX2 (mz and m3), 
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Figure 2.56 Three-variable function f (x1, x2, x3) = 
yY m0, 1, 2, 3,7). 


and x2x3 (m3 and m7). Finally, there is one implicant that covers a group of four minterms, 
which consists of a single literal x). 


Prime Implicant 

An implicant is called a prime implicant if it cannot be combined into another implicant 
that has fewer literals. Another way of stating this definition is to say that it is impossible 
to delete any literal in a prime implicant and still have a valid implicant. 

In Figure 2.56 there are two prime implicants: x; and x2x3. It is not possible to delete 
a literal in either of them. Doing so for x, would make it disappear. For x2x3, deleting 
a literal would leave either x2 or x3. But x2 is not an implicant because it includes the 
valuation (x), x2, x3) = 110 for which f = 0, and x3 is not an implicant because it includes 
(x1, X2, x3) = 101 for which f = 0. Another way of thinking about prime implicants is that 
they represent “the largest groups of Is” that can be circled in the Karnaugh map. 


Cover 

A collection of implicants that account for all valuations for which a given function is 
equal to | is called a cover of that function. A number of different covers exist for most 
functions. Obviously, a set of all minterms for which f = | is a cover. It is also apparent 
that a set of all prime implicants is a cover. 

A cover defines a particular implementation of the function. In Figure 2.56 a cover 
consisting of minterms leads to the expression 


SF =X X2N3 + Xp X2H3 + Ny XQNZ + Xp X2X3 + Xj X2X3 
Another valid cover is given by the expression 
f =X X2 + Xx2 + X2.x3 
The cover comprising the prime implicants is 
f =X + 2x3 


While all of these expressions represent the function f correctly, the cover consisting of 
prime implicants leads to the lowest-cost implementation. 
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Cost 


In Section 2.6.1 we suggested that a good indication of the cost of a logic circuit is the 
number of gates plus the total number of inputs to all gates in the circuit. We will use this 
definition of cost throughout the book. But we will assume that primary inputs, namely, 
the input variables, are available in both true and complemented forms at zero cost. Thus 
the expression 


f = X1X2 + x3%X4 


has a cost of nine because it can be implemented using two AND gates and one OR gate, 
with six inputs to the AND and OR gates. 

If an inversion is needed inside a circuit, then the corresponding NOT gate and its input 
are included in the cost. For example, the expression 


g = (1X2 + x3) (4 + x5) 


is implemented using two AND gates, two OR gates, and one NOT gate to complement 
(x1|X2 + x3), with nine inputs. Hence the total cost is 14. 


2.12.2 MINIMIZATION PROCEDURE 


We have seen that it is possible to implement a given logic function with various circuits. 
These circuits may have different structures and different costs. When designing a logic 
circuit, there are usually certain criteria that must be met. One such criterion is likely to 
be the cost of the circuit, which we considered in the previous discussion. In general, the 
larger the circuit, the more important the cost issue becomes. In this section we will assume 
that the main objective is to obtain a minimum-cost circuit. 

In the previous subsection we concluded that the lowest-cost implementation is 
achieved when the cover of a given function consists of prime implicants. The ques- 
tion then is how to determine the minimum-cost subset of prime implicants that will cover 
the function. Some prime implicants may have to be included in the cover, while for others 
there may be a choice. If a prime implicant includes a minterm for which f = | that is not 
included in any other prime implicant, then it must be included in the cover and is called 
an essential prime implicant. In the example in Figure 2.56, both prime implicants are 
essential. The term x2x3 is the only prime implicant that covers the minterm m7, and x, 
is the only one that covers the minterms mo, m, and m2. Notice that the minterm m3 is 
covered by both of these prime implicants. The minimum-cost realization of the function 
is 


f =X + x2x%3 


We will now present several examples in which there is a choice as to which prime 
implicants to include in the final cover. Consider the four-variable function in Figure 2.57. 
There are five prime implicants: X13, %2%3, 3X4, X1XoxX4, and x2xX3x4. The essential ones 
(highlighted in blue) are x2.x3 (because of 1711), x3X4 (because of m4), and x2x3x4 (because of 
m3). They must be included in the cover. These three prime implicants cover all minterms 
for which f = 1 except m7. It is clear that m7 can be covered by either Xx3 or X1x2Xx4. 
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X1XpXq 
XyXZXq 
X3X4 
*1%3 39X53 
Figure 2.57 — Four-variable function f(x1,..., x4) = 


¥- m(2, 3, 5, 6, 7, 10, 11, 13, 14). 


Because X,x3 has a lower cost, it is chosen for the cover. Therefore, the minimum-cost 
realization is 


f = X2x3 + x3X4 + X2N3X4 + X13 


From the preceding discussion, the process of finding a minimum-cost circuit involves 
the following steps: 


Generate all prime implicants for the given function f. 
Find the set of essential prime implicants. 


3. Ifthe set of essential prime implicants covers all valuations for which f = 1, then 
this set is the desired cover of f. Otherwise, determine the nonessential prime 
implicants that should be added to form a complete minimum-cost cover. 


The choice of nonessential prime implicants to be included in the cover is governed by the 
cost considerations. This choice is often not obvious. Indeed, for large functions there may 
exist many possibilities, and some heuristic approach (i.e., an approach that considers only 
a subset of possibilities but gives good results most of the time) has to be used. One such 
approach is to arbitrarily select one nonessential prime implicant and include it in the cover 
and then determine the rest of the cover. Next, another cover is determined assuming that 
this prime implicant is not in the cover. The costs of the resulting covers are compared, and 
the less-expensive cover is chosen for implementation. 

We can illustrate the process by using the function in Figure 2.58. Of the six prime 
implicants, only x3x4 is essential. Consider next x,x2x3 and assume first that it will be 
included in the cover. Then the remaining three minterms, 779, 11, and ms, will require 
two more prime implicants to be included in the cover. A possible implementation is 


Sf = 3X4 + x1 X2H3 + Hy H3X4 + XY X2X3 


The second possibility is that x,x2x3 is not included in the cover. Then x;x2x4 becomes 
essential because there is no other way of covering m3. Because x}x2x4 also covers m5, 
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Figure 2.58 The function f(x,...,x4) = 
>> m(0, 4,8, 10, 11, 12, 13, 15). 


only myo and m,,; remain to be covered, which can be achieved with x,x2x3. Therefore, the 
alternative implementation is 


f = X34 + Xp X2x4 + X1%2X3 


Clearly, this implementation is a better choice. 

Sometimes there may not be any essential prime implicants at all. An example is given 
in Figure 2.59. Choosing any of the prime implicants and first including it, then excluding 
it from the cover leads to two alternatives of equal cost. One includes the prime implicants 
indicated in black, which yields 


Sf = X1X3Xq + X2X3X4 + X1XZX4 + XINZX4 
The other includes the prime implicants indicated in blue, which yields 
fo =X 1X2X4 + X1xX2H3 + XXQHX4 + X1X2X3 


This procedure can be used to find minimum-cost implementations of both small and 
large logic functions. For our small examples it was convenient to use Karnaugh maps 
to determine the prime implicants of a function and then choose the final cover. Other 
techniques based on the same principles are much more suitable for use in CAD tools; we 
will introduce such techniques in Chapter 8. 

The previous examples have been based on the sum-of-products form. We will next 
illustrate that the same concepts apply for the product-of-sums form. 
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Now that we know how to find the minimum-cost sum-of-products (SOP) implementations 
of functions, we can use the same techniques and the principle of duality to obtain minimum- 
cost product-of-sums (POS) implementations. In this case it is the maxterms for which 
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Figure 2.60 = POS minimization of f(x, x, x3) = TIM (4, 5, 6). 


f = 0 that have to be combined into sum terms that are as large as possible. Again, a sum 
term is considered larger if it covers more maxterms, and the larger the term, the less costly 
it is to implement. 

Figure 2.60 depicts the same function as Figure 2.56 depicts. There are three maxterms 
that must be covered: M4, Ms, and Me. They can be covered by two sum terms shown in 
the figure, leading to the following implementation: 


f = 1 +22) + x3) 


A circuit corresponding to this expression has two OR gates and one AND gate, with two 
inputs for each gate. Its cost is greater than the cost of the equivalent SOP implementation 
derived in Figure 2.56, which requires only one OR gate and one AND gate. 

The function from Figure 2.57 is reproduced in Figure 2.61. The maxterms for which 
f = 0 can be covered as shown, leading to the expression 


Sf = (Xp +.%3) (13 + X4) + XQ +.%3 + X4) 
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Figure 2.61 POS minimization of f(x, ..., x4) = 
TIM (0, 1, 4, 8,9, 12, 15). 


This expression represents a circuit with three OR gates and one AND gate. Two of the 
OR gates have two inputs, and the third has four inputs; the AND gate has three inputs. 
Assuming that both the complemented and uncomplemented versions of the input variables 
x, to x4 are available at no extra cost, the cost of this circuit is 15. This compares favorably 
with the SOP implementation derived from Figure 2.57, which requires five gates and 13 
inputs at a total cost of 18. 

In general, as we already know from Section 2.6.1, the SOP and POS implementations 
of a given function may or may not entail the same cost. The reader is encouraged to find 
the POS implementations for the functions in Figures 2.58 and 2.59 and compare the costs 
with the SOP forms. 

We have shown how to obtain minimum-cost POS implementations by finding the 
largest sum terms that cover all maxterms for which f = 0. Another way of obtaining 
the same result is by finding a minimum-cost SOP implementation of the complement of 
jf. Then we can apply DeMorgan’s theorem to this expression to obtain the simplest POS 


realization because f = f. For example, the simplest SOP implementation of f in Figure 
2.60 is 


f =X) X2 + x1%3 


Complementing this expression using DeMorgan’s theorem yields 


f=f =m +X 


= Xj X2 + X1X3 


= (X + x2) + x3) 


which is the same result as obtained above. 
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Using this approach for the function in Figure 2.61 gives 
f = os + X3Xq + xp x213.x4 


Complementing this expression produces 


SF =f = Xox3 + 3X4 + xp x2X3X4 


= XQNX3 + X3X4 + Xj XIQN3X4 
= (x2 + x3) (13 + 44) (%1 + X2 + X3 + X4) 


which matches the previously-derived implementation. 


2.14 INCOMPLETELY SPECIFIED FUNCTIONS 


In digital systems it often happens that certain input conditions can never occur. For 
example, suppose that x; and x2 control two interlocked switches such that both switches 
cannot be closed at the same time. Thus the only three possible states of the switches 
are that both switches are open or that one switch is open and the other switch is closed. 
Namely, the input valuations (x;, x2) = 00, 01, and 10 are possible, but 11 is guaranteed 
not to occur. Then we say that (x;, x2) = 11 is adon't-care condition, meaning that a circuit 
with x; and x2 as inputs can be designed by ignoring this condition. A function that has 
don’t-care condition(s) is said to be incompletely specified. 

Don’ t-care conditions, or don 't-cares for short, can be used to advantage in the design 
of logic circuits. Since these input valuations will never occur, the designer may assume that 
the function value for these valuations is either 1 or 0, whichever is more useful in trying 
to find a minimum-cost implementation. Figure 2.62 illustrates this idea. The required 
function has a value of | for minterms m2, m4, ms, mo, and mio. Assuming the above- 
mentioned interlocked switches, the x; and x2 inputs will never be equal to | at the same 
time; hence the minterms 12, m3, m4, and mys can all be used as don’t-cares. The don’t- 
cares are denoted by the letter d in the map. Using the shorthand notation, the function f 
is specified as 


FQ, ..-,%4) = ae 4,5, 6, 10) + D(12, 13, 14, 15) 


where D is the set of don’t-cares. 

Part (a) of the figure indicates the best sum-of-products implementation. To form 
the largest possible groups of Is, thus generating the lowest-cost prime implicants, it is 
necessary to assume that the don’t-cares Dj, D3, and D,4 (corresponding to minterms 
m2, M43, and m4) have the value of 1 while D,; has the value of 0. Then there are only 
two prime implicants, which provide a complete cover of f. The resulting implementation 
is 


f = X2X3 + X3X4 


2.14 INCOMPLETELY SPECIFIED FUNCTIONS 


00 O1 11 10 


X4X3 


X3X4 


(X_ + x3) 


(i 4-5,) 


(b) POS implementation 


Figure 2.62 Two implementations of the function f (x;,...,x4) = 
Y m(2, 4, 5, 6, 10) + D(12, 13, 14, 15). 


Part (b) shows how the best product-of-sums implementation can be obtained. The 
same values are assumed for the don’t cares. The result is 


SF = (x2 +.x3)(%3 + X4) 


The freedom in choosing the value of don’t-cares leads to greatly simplified realizations. If 
we were to naively exclude the don’t-cares from the synthesis of the function, by assuming 
that they always have a value of 0, the resulting SOP expression would be 


f = X1X2X3 + X1x3X4 + X7HZX4 
and the POS expression would be 
f = (2 + 3) 3 + X4)(%1 + X2) 


Both of these expressions have higher costs than the expressions obtained with a more 
appropriate assignment of values to don’t-cares. 

Although don’t-care values can be assigned arbitrarily, an arbitrary assignment may 
not lead to a minimum-cost implementation of a given function. If there are k don’t-cares, 
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then there are 2* possible ways of assigning 0 or 1 values to them. In the Karnaugh map 
we can usually see how best to do this assignment to find the simplest implementation. 

In the example above, we chose the don’t-cares Dj7, D)3, and Dj4 to be equal to 1 and 
Djs equal to 0 for both the SOP and POS implementations. Thus the derived expressions 
represent the same function, which could also be specified as )> m(2, 4, 5, 6, 10, 12, 13, 14). 
Assigning the same values to the don’t-cares for both SOP and POS implementations is not 
always a good choice. Sometimes it may be advantageous to give a particular don’t-care 
the value | for SOP implementation and the value 0 for POS implementation, or vice versa. 
In such cases the optimal SOP and POS expressions will represent different functions, 
but these functions will differ only for the valuations that correspond to these don’t-cares. 
Example 2.26 in Section 2.17 illustrates this possibility. 

Using interlocked switches to illustrate how don’t-care conditions can occur in a real 
system may seem to be somewhat contrived. A more practical example is shown below, 
and in future chapters we will encounter many examples of don’t-cares that occur in the 
course of practical design of digital circuits. 


Example 2.15 


In Section 2.8.3 we designed a logic circuit that displays the decimal value of a two-bit 
number on a seven-segment display. In this example we will design a similar circuit, 
except that the input will be a four-bit number X = x3x2x;xo that represents the decimal 
values 0, 1,..., 9. Using four bits to represent decimal digits in this way is often referred 
to as the binary coded decimal (BCD) representation. We discuss BCD numbers in detail in 
Chapter 3. The circuit is depicted in Figure 2.63a. Part (b) of the figure gives a truth table 
for each of the seven outputs a, b,..., g, that control the display. It also indicates how the 
display should appear for each value of X. Since X is restricted to decimal digits, then the 
values of X from 1010 to 1111 are not used. These entries are omitted from the truth table 
in Figure 2.63b, and can be treated as don’t cares in the design of the circuit. 

To derive logic expressions for the outputs a to g, it is useful to draw Karnaugh maps. 
Figure 2.63c gives Karnaugh maps for the functions a and e. For the function a, the best 
result is obtained by setting all six don’t-care cells in the map to the value of 1, which 
gives a = XoXo + X1 + X2X0 + x3. However, for the function e a better choice is to set only 
two of the don’t-care cells, corresponding to x3x2x;x9 = 1010 and x3x2x1x9 = 1110, to 1. 
The other don’t-care cells should be set to 0 for this function to yield the minimum-cost 
expression € = X2Xo + x) Xo. 


2.15 MULTIPLE-OUTPUT CIRCUITS 


As illustrated in Example 2.15, in practical digital systems it is often necessary to implement 
a number of functions that are part of a larger logic circuit. Instead of implementing each 
of these functions separately, it may be possible to share some of the gates needed in the 
implementation of individual functions. For example, in Figure 2.63 the AND gate that 
produces X2xX» could be shared for use in both functions a and e. 
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(b) Truth table 
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a= XoXq + X1 +X9XQ + X3 e= XoXQ +X1Xo 


Figure 2.63 


(c) The Karnaugh maps for outputs a and e. 


Using don’t-care minterms when displaying BCD numbers. 
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Example 2.16 An example of gate sharing is given in Figure 2.64. Two functions, f; and f, of the same 
variables are to be implemented. The minimum-cost implementations for these functions 
are obtained as shown in parts (a) and (b) of the figure. This results in the expressions 


fi = X1X3 + X 1x3 + XQX3Xq 

fo = X1X3 + 1x3 + X2x3Xq 
The cost of f; is four gates and 10 inputs, for a total of 14. The cost of f5 is the same. Thus 
the total cost is 28 if both functions are implemented by separate circuits. A less-expensive 


realization is possible if the two circuits are combined into a single circuit with two outputs. 
Because the first two product terms are identical in both expressions, the AND gates that 


a ha) 


00 O01 11 10 


X3X4 


(a) Function f; (b) Function f, 


fi 


fy 


(c) Combined circuit for f; and /5 


Figure 2.64 = An example of multiple-output synthesis. 


2.15 MULTIPLE-OUTPUT CIRCUITS 


implement them need not be duplicated. The combined circuit is shown in Figure 2.64c. 
Its cost is six gates and 16 inputs, for a total of 22. 

In this example we reduced the overall cost by finding minimum-cost realizations of f; 
and f2 and then sharing the gates that implement the common product terms. This strategy 
does not necessarily always work the best, as the next example shows. 


99 


Figure 2.65 shows two functions to be implemented by a single circuit. Minimum-cost 
realizations of the individual functions f; and f4 are obtained from parts (a) and (b) of the 
figure. 


Sa = X1X4 + X2%4 +X X2.X3 
fa = X14 + XyXq + XyxQxZX4 


None of the AND gates can be shared, which means that the cost of the combined circuit 
would be six AND gates, two OR gates, and 21 inputs, for a total of 29. 

But several alternative realizations are possible. Instead of deriving the expressions for 
jf; and f4 using only prime implicants, we can look for other implicants that may be shared 
advantageously in the combined realization of the functions. Figure 2.65c shows the best 
choice of implicants, which yields the realization 


Sa = 1 X2Xq +X x2xZX4 +X X4 
Sa = X1X2Xq +X x2x3X4 + X24 


The first two implicants are identical in both expressions. The resulting circuit is given in 
Figure 2.65d. It has the cost of six gates and 17 inputs, for a total of 23. 


Example 2.17 


In Example 2.16 we sought the best SOP implementation for the functions f; and f in 
Figure 2.64. We will now consider the POS implementation of the same functions. The 
minimum-cost POS expressions for f, and jf) are 


Ai = Oy +3) (y+ x2 + x3) (41 +43 + x4) 
fo = (1 +3) (%1 + x2 +3) +3 + x4) 


There are no common sum terms in these expressions that could be shared in the imple- 
mentation. Moreover, from the Karnaugh maps in Figure 2.64, it is apparent that there is 
no sum term (covering the cells where f; = f2 = 0) that can be profitably used in realizing 
both f; and f2. Thus the best choice is to implement each function separately, according to 
the preceding expressions. Each function requires three OR gates, one AND gate, and 11 
inputs. Therefore, the total cost of the circuit that implements both functions is 30. This 
realization is costlier than the SOP realization derived in Example 2.16. 


Example 2.18 
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x fs 


X4 ty 


(d) Combined circuit for f; and fy 


Figure 2.65 = Another example of multiple-output synthesis. 


2.17. EXAMPLES OF SOLVED PROBLEMS 
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Consider now the POS realization of the functions f; and f4 in Figure 2.65. The minimum- 
cost POS expressions for f3 and /4 are 


Sa = (%3 + X4) (X20 + X4)(%1 + x4) + 2X2) 
Sa = (03 + X4) (2 + 4) (H1 + X4) (1 + XQ + X4) 


The first three sum terms are the same in both f; and f4; they can be shared in a combined 
circuit. These terms require three OR gates and six inputs. In addition, one 2-input OR 
gate and one 4-input AND gate are needed for /;, and one 3-input OR gate and one 4-input 
AND gate are needed for f4. Thus the combined circuit comprises five OR gates, two AND 
gates, and 19 inputs, for a total cost of 26. This cost is slightly higher than the cost of the 
circuit derived in Example 2.17. 


Example 2.19 


These examples show that the complexities of the best SOP or POS implementations 
of given functions may be quite different. For the functions in Figures 2.64 and 2.65, the 
SOP form gives better results. But if we are interested in implementing the complements 
of the four functions in these figures, then the POS form would be less costly. 

Sophisticated CAD tools used to synthesize logic functions will automatically perform 
the types of optimizations illustrated in the preceding examples. 


2.16 CONCLUDING REMARKS 


In this chapter we introduced the concept of logic circuits. We showed that such circuits can 
be implemented using logic gates and that they can be described using a mathematical model 
called Boolean algebra. Because practical logic circuits are often large, it is important to 
have good CAD tools to help the designer. We introduced the Verilog hardware description 
language that can be used to specify circuits for use in a CAD tool. We urge the reader to 
start using CAD software for the design of logic circuits as soon as possible. 

This chapter has attempted to provide the reader with an understanding of some aspects 
of synthesis and optimization for logic functions. Now that the reader is comfortable with 
the fundamental concepts, we can examine digital circuits of a more sophisticated nature. 
The next chapter describes circuits that perform arithmetic operations, which are a key part 
of computers. 


2.17 EXAMPLES OF SOLVED PROBLEMS 


This section presents some typical problems that the reader may encounter, and shows how 
such problems can be solved. 
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Example 2.20 


Problem: Determine if the following equation is valid 
XX3 + X9x3 + X1X2 = XyX2 +x x3 + X73 


Solution: The equation is valid if the expressions on the left- and right-hand sides represent 
the same function. To perform the comparison, we could construct a truth table for each 
side and see if the truth tables are the same. An algebraic approach is to derive a canonical 
sum-of-products form for each expression. 

Using the fact that x + x = 1 (theorem 8b), we can manipulate the left-hand side as 
follows: 


LHS = X1%3 + x2x3 + x1 X2 
= XX + X2)X3 + (1 + 1 )X2x3 + X1X2(%3 + X3) 
= X1XIX3 + XAIAZ A XyHIXZ HP Aj AQX3 Pb AXXIAZ + X1XX3 
These product terms represent the minterms 2, 0, 7, 3, 5, and 4, respectively. 
For the right-hand side we have 
RHS => X1X2 + X1X3 + X7X3 


= X1X2 (x3 + X3) + xy (2 + X2)x3 + Oy + 2X1) 2X3 


= X1X2X3 + Xj X2X3 + Xj xX2XZz + XjXH2HZ + XjX2X3 + X1X2X3 


These product terms represent the minterms 3, 2, 7, 5, 4, and 0, respectively. Since both 
expressions specify the same minterms, they represent the same function; therefore, the 
equation is valid. Another way of representing this function is by }> m(0, 2, 3, 4, 5, 7). 


Example 2.21 


Problem: Design the minimum-cost product-of-sums expression for the function 
f (X1, X2, x3, X4) = D> m0, 2, 4, 5, 6, 7, 8, 10, 12, 14, 15). 


Solution: The function is defined in terms of its minterms. To find a POS expression we 
should start with the definition in terms of maxterms, which is f = TIM (1, 3, 9, 11, 13). 
Thus, 


f =M, -M3-Mo- Mi, - Mj3 
= (x1 +X +3 + X4) (Ky + X2 + 3 +.X4)(%1 + X20 +43 + X4) Ky + x2 + 3 + X4)(%1 + X2 + 03 + X4) 


We can rewrite the product of the first two maxterms as 


My, - M3 = (x1 + x2 +X4 + x3) (x1 +2 +X4+%3) using commutative property 10b 


=x +X. + 2X4 + 13%3 using distributive property 12b 
=x, +%x.+%44+0 using theorem 8a 
=x, +x. +X4 using theorem 6b 


Similarly, Mg -M),; =X; + x2 + X4. Now, we can use Mo again, according to property 7a, 
to derive My -M\3 =X; + x3 +X4. Hence 


f = Oy +22 + X4) 1 + Xo + X4)(%] + 03 + X4) 
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Applying 12b again, we get the final answer 
f = (2 + X4) 1 + x3 + X4) 


Problem: A circuit that controls a given digital system has three inputs: x), x2, and x3. It Example 2.22 
has to recognize three different conditions: 


e Condition A is true if x3 is true and either x, is true or x2 is false 
e Condition B is true if x, is true and either x2 or x3 is false 


° Condition C is true if x2 is true and either x, is true or x3 is false 


The control circuit must produce an output of | if at least two of the conditions A, B, and C 
are true. Design the simplest circuit that can be used for this purpose. 


Solution: Using 1 for true and 0 for false, we can express the three conditions as follows: 
A= X3(X1 + X2) = X3X1 + x3X2 
B= xX) (%_ +%3) = xXyX. + .X1%3 
C = x2(%) +3) = xox + X0%3 
Then, the desired output of the circuit can be expressed as f = AB+AC+ BC. These 
product terms can be determined as: 
AB = (x3x1 + X3X2)(x1X2 + X1X3) 
= 3X XX + AZXpX]X3 + XZXQXX2 + X3X2X1X3 
= X3XjX2 +O + x3X2x1 +0 


= X1X2X3 


AC = (x3X1 + 3X2) (2x1 + 2X3) 
= X3BXpXQX] + XZXyX2H3 + XZX2X2QX] + X3X2X2X3 
= 43X42 +0+0+0 


= X1X2X3 


BC = (x1 X2 + 1X3) (2x1 + X2X3) 
= XXX2X, + XyX2X9X3z + Xp X3X2X, +X X3X2X3 
=0+0+ X1X3X2 + X1X3X2 
= X1X2X3 
Therefore, f can be written as 
f = X1X2X3 + XXQX3 + X1X2X3 
= x1 (%_ + X2)x3 + x1 X2 (x3 + X3) 
= X1xX3 + X1XQ 


= x1 (%3 + X2) 
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(a) Function A (b) Function B 
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(c) Function C (d) Function f 


Figure 2.66 — The Venn diagrams for Example 2.23. 


Example 2.23 Problem: Solve the problem in Example 2.22 by using Venn diagrams. 


Solution: The Venn diagrams for functions A, B, and C in Example 2.22 are shown in parts 
a toc of Figure 2.66. Since the function f has to be true when two or more of A, B, and C 
are true, then the Venn diagram for f is formed by identifying the common shaded areas in 
the Venn diagrams for A, B, and C. Any area that is shaded in two or more of these diagrams 
is also shaded in f, as shown in Figure 2.66d. This diagram corresponds to the function 


f =X 1X2 + X1x3 = X1 (XQ +3) 


Example 2.24 Problem: Use algebraic manipulation to derive the simplest sum-of-products expression 
for the function 


f= X2X3X4 + Xx3%4 + XY XNG 
Solution: Applying the consensus property 17a to the first two terms yields 


Sf = 2X34 + X1XZX4 + XQN4X|AX4 + XY XAG 


= X9X3X4 + XX3ZX4 + XyX2K4 +X X2H4 
Now, using the combining property 14a for the last two terms gives 
f= X2X3Xq + XyA3X4q +X Xq 
Finally, using the absorption property 13a produces 


f = x2X3x4 + X1X4 
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Problem: Use algebraic manipulation to derive the simplest product-of-sums expression Example 2.25 
for the function 


f = Oy + xy +3) +X. + X4) (1 +43 + 4) 
Solution: Applying the consensus property 17D to the first two terms yields 
F = Br + 2 + 3) + Xp +X 4) + 3 +.%1 + X4) 1 +43 + X4) 
= (X1 + x2 + 13) +X + X4) 1 + 2X3 + X4) (K+ x3 + X4) 
Now, using the combining property 14d for the last two terms gives 
Sf = (1 + x2 +03) 1 + X2 + X4) (1 + 3) 
Finally, using the absorption property 13d on the first and last terms produces 


Sf = (% +X. + X4)(%1 + x3) 


Problem: Determine the minimum-cost SOP and POS expressions for the function Example 2.26 
f 1, X2,.%3, X4) = D2 m4, 6, 8, 10, 11, 12, 15) + DG, 5,7, 9). 


Solution: The function can be represented in the form of a Karnaugh map as shown in 
Figure 2.67a. Note that the location of minterms in the map is as indicated in Figure 2.53. 
To find the minimum-cost SOP expression, it is necessary to find the prime implicants that 
cover all Is in the map. The don’t-cares may be used as desired. Minterm me is covered 
only by the prime implicant x;x2, hence this prime implicant is essential and it must be 
included in the final expression. Similarly, the prime implicants x;x2 and x3x4 are essential 
because they are the only ones that cover mo and mys, respectively. These three prime 
implicants cover all minterms for which f = | except m2. This minterm can be covered 
in two ways, by choosing either x,x3x4 or x2x3X4. Since both of these prime implicants 
have the same cost, we can choose either of them. Choosing the former, the desired SOP 
expression is 


SF =X 1X2 + xX. + HZH4 + HBX 4 


These prime implicants are encircled in the map. 

The desired POS expression can be found as indicated in Figure 2.64b. In this case, 
we have to find the sum terms that cover all Os in the function. Note that we have written 
Os explicitly in the map to emphasize this fact. The term (x; + x2) is essential to cover the 
Os in squares 0 and 2, which correspond to maxterms Mp and M>. The terms (x3 + x4) and 
(x; + X2 +3 + x4) must be used to cover the Os in squares 13 and 14, respectively. Since 
these three sum terms cover all Os in the map, the POS expression is 


f = (x1 +.x2) (x3 + X4) 1 +X + X3 + x4) 


The chosen sum terms are encircled in the map. 

Observe the use of don’t-cares in this example. To get a minimum-cost SOP expression 
we assumed that all four don’t-cares have the value 1. But, the minimum-cost POS expres- 
sion becomes possible only if we assume that don’t-cares 3, 5, and 9 have the value 0 while 
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the don’t-care 7 has the value 1. This means that the resulting SOP and POS expressions are 
not identical in terms of the functions they represent. They cover identically all valuations 
for which f is specified as | or 0, but they differ in the valuations 3, 5, and 9. Of course, 
this difference does not matter, because the don’t-care valuations will never be applied as 
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X{X3X4 


X3X4 


X1Xy X1Xq 


(a) Determination of the SOP expression 


(x1 +X +X3 4X4) 


(xy at X>) 


(b) Determination of the POS expression 


Figure 2.67 — Karnaugh maps for Example 2.26. 


inputs to the implemented circuits. 


Example 2.27 


Problem: Use Karnaugh maps to find the minimum-cost SOP and POS expressions for the 


function 


fi... + X4) = XH 3X4 + 3X4 + X|XoNg + Hy ANZX4 


assuming that there are also don’t-cares defined as D = )°(9, 12, 14). 


2.17. EXAMPLES OF SOLVED PROBLEMS 


Solution: The Karnaugh map that represents this function is shown in Figure 2.68a. The 
map is derived by placing 1s that correspond to each product term in the expression used 
to specify f. The term x1x3x4 corresponds to minterms 0 and 4. The term x3x4 represents 
the third row in the map, comprising minterms 3, 7, 11, and 15. The term X,xX2x4 specifies 
minterms 1 and 3. The fourth product term represents the minterm 13. The map also 
includes the three don’t-care conditions. 

To find the desired SOP expression, we must find the least-expensive set of prime 
implicants that covers all 1s in the map. The term x3x4 is a prime implicant which must 
be included because it is the only prime implicant that covers the minterm 7; it also covers 
minterms 3, 11, and 15. Minterm 4 can be covered with either %1%3X4 or x2%3xX4. Both of 
these terms have the same cost; we will choose x|X3%4 because it also covers the minterm 0. 
Minterm | may be covered with either x|X2x3 or X2x4; we should choose the latter because 
its cost is lower. This leaves only the minterm 13 to be covered, which can be done with 


X{X4 


X3X4 


XoX4 


(a) Determination of the SOP expression 


(44+ x5 +X3+ Hs) 


(x3 + X4) 


(xX, + X4) 


(b) Determination of the POS expression 


Figure 2.68 — Karnaugh maps for Example 2.27. 
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either x|x4 Or x,x2 at equal costs. Choosing x;x4, the minimum-cost SOP expression is 
f = X3Xq + X1X3X4 + X2X4 + X1X4 


Figure 2.68b shows how we can find the POS expression. The sum term (%3 + x4) 
covers the Os in the bottom row. To cover the 0 in square 8 we must include (x; + x4). The 
remaining 0, in square 5, must be covered with (x; + X2 + x3 +4). Thus, the minimum- 
cost POS expression is 


SF = 3 + x4) (1 + 4) (X1 + X2 + 03 + X4) 


Example 2.28 


Problem: Consider the expression 
f = 53(81 + 52) + 5152 
Derive a minimum-cost SOP expression for f. 
Solution: Applying the distributive property 12a we can write 
f = 5183 + S283 + 8182 


Now, it is easy to see how f can be represented in the form of a Karnaugh map, as depicted 
in Figure 2.69. The figure shows that minterms from each of the above three product terms 
cover the bottom row of the Karnaugh map, leading to the minimum-cost expression 


f = 53+ 5152 


Example 2.29 


Write Verilog code that represents the logic circuit in Figure 2.70. Use only continuous 
assignment statements to specify the required functions. 


Solution: An example of the required Verilog code is shown in Figure 2.71. 


11 10 


S953 S153 S185 


Figure 2.69 = A Karnaugh map that represents the function in 
Example 2.28. 
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N 


Figure 2.70 The logic circuit for Example 2.29. 


module f_g (x, y, z, f, g); 


input x, y, z; 
output f, g; 
wire k; 


assign k = y “ z; 
assign g =k “x; 
assign f = (~k & z) | (k & x); 


endmodule 


Figure 2.71 Verilog code for Example 2.29. 


Consider the circuit shown in Figure 2.72. It includes two copies of the 2-to-1 multiplexer Example 2.30 
circuit from Figure 2.33, and the adder circuit from Figure 2.12. If the multiplexers’ select 

input m = 0, then this circuit produces the sum $ = a+ b. But if m = 1, then the circuit 

produces $ = c + d. By using multiplexers, we are able to share one adder for generating 

the two different sums a+b and c+d. Sharing a subcircuit for multiple purposes is 

commonly-used in practice, although usually with larger subcircuits than our one-bit adder. 

Write Verilog code for the circuit in Figure 2.72. Use the hierarchical style of Verilog 

code illustrated in Figure 2.47, including two instances of a Verilog module for the 2-to-1 

multiplexer subcircuit, and one instance of the adder subcircuit. 


Solution: An example of the required Verilog code is shown in Figure 2.73. We should 
mention that both the shared module and the adder module have input ports named a and b. 
This does not cause any conflict because the name of a signal declared in a Verilog module 
is limited to the scope of that module. 
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Figure 2.72 The circuit for Example 2.30. 


module shared (a, b, c, d, m, s1, sO); 
input a, b, c, d, m; 
output s1, sO; 
wire wl, w2; 
mux2tol U1 (a,c, m, wl); 
mux2tol U2 (b, d, m, w2); 
adder U3 (wl, w2, s1, sO); 
endmodule 


module mux2tol (x1, x2, s, f); 
input x1, x2, s; 
output f; 
assign f = (~s & x1) | (s & x2); 
endmodule 


module adder (a, b, sl, sO); 
input a, b; 
output s1, sO; 
assign sl =a & b; 
assign sO =a “ b; 
endmodule 


Figure 2.73 Verilog code for Example 2.30. 
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In Chapter | we said that several types of integrated circuit chips are available for imple- 
mentation of logic circuits, and that field-programmable gate arrays (FPGAs) are commonly 
used. In an FPGA, logic functions are not implemented directly using AND, OR, and NOT 
gates. Instead, an FPGA implements logic functions by using a type of circuit element 
called lookup tables (LUTs). A LUT can be programmed by the user to implement any 
logic function of its inputs. Thus, if a LUT has three inputs, then it can implement any 
logic function of these three inputs. We describe FPGAs and lookup tables in detail in 
Appendix B. Consider the four-input function 


Sf = Xp X2X4 + X2X3X4 + X1XINZ 
Show how this function can be implemented in an FPGA that has three-input LUTs. 
Solution: A straightforward implementation requires four 3-input LUTs. Three of these 
LUTs are used for the three-input AND operations, and the final LUT implements the three- 


input OR operation. It is also possible to apply logic synthesis techniques that result in 
fewer LUTs. We discuss these techniques in Chapter 8 (see Example 8.19). 


Example 2.31 


PROBLEMS 


Answers to problems marked by an asterisk are given at the back of the book. 


2.1. Use algebraic manipulation to prove that x + yz = (x+y) - (x +z). Note that this is the 


distributive rule, as stated in identity 12b in Section 2.5. 


2.2 Use algebraic manipulation to prove that (x + y)- («+ y) =x. 


2.3 Use algebraic manipulation to prove that xy + yz-+x%z = xy + Xz. Note that this is the 


consensus property 17a in Section 2.5. 


2.4 Use the Venn diagram to prove the identity in Problem 2.3. 


2.5 Use the Venn diagram to prove DeMorgan’s theorem, as given in expression 15b in Sec- 


tion 2.5. 


2.6 Use the Venn diagram to prove that 


(x1 +X +.%3) + (Xy + 2X2 +23) = xy + X2 


*2.7 Determine whether or not the following expressions are valid, i.e., whether the left- and 


right-hand sides represent the same function. 

(a) X1x3 + X1X2%3 + XxX. + XyX2 = X2x3 + XpX3 + XZ + X1x2x3 

(b) x13 + x2x3 + X2X3 = (Ky + X2 + x3) (4K + XQ + 2X3) (X] + X2 +3) 
(C) (1 + 03) (Hy +X +3) (%1 + X2) = (1 + X2)(X2 + 3) (1 + X3) 


2.8 Draw a timing diagram for the circuit in Figure 2.24a. Show the waveforms that can be 


observed on all wires in the circuit. 


2.9 Repeat Problem 2.8 for the circuit in Figure 2.24b. 
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2.10 Use algebraic manipulation to show that for three input variables x,, x2, and x3 
Y\m(1, 2,3, 4,5, 6, 7) =m +22 +3 
2.11 Use algebraic manipulation to show that for three input variables x), x2, and x3 
TIM (0, 1, 2, 3,4, 5, 6) = x)x2x3 


*2.12 Use algebraic manipulation to find the minimum sum-of-products expression for the func- 
tion f = xyx3 + xX. + Xp x2x3 +X X2X3. 


2.13 Use algebraic manipulation to find the minimum sum-of-products expression for the func- 
tion f = xy X2X3 + xy x2xX4 + X1X2X3X4. 


2.14 Use algebraic manipulation to find the minimum product-of-sums expression for the func- 
tion f = (4 +3 +.x4) + (x +X. +23) + 1 + X2 +3 + x4). 


*2.15 Use algebraic manipulation to find the minimum product-of-sums expression for the func- 
tion f = (4) + x2 +23) + (1 + X2 +43) + 1 +X. +43) > (1 + 2 +3). 


2.16 (a) Show the location of all minterms in a three-variable Venn diagram. 
(b) Show a separate Venn diagram for each product term in the function f = x,X2x3 + 
X1X2 +X x3. Use the Venn diagram to find the minimal sum-of-products form of f. 


2.17 Represent the function in Figure 2.23 in the form of a Venn diagram and find its minimal 
sum-of-products form. 


2.18 Figure P2.1 shows two attempts to draw a Venn diagram for four variables. For parts (a) 
and (b) of the figure, explain why the Venn diagram is not correct. (Hint: The Venn diagram 
must be able to represent all 16 minterms of the four variables.) 


(aN 
Y(Ay 
Wy 


(a) (b) 


Figure P2.1_ = Two attempts to draw a four-variable Venn diagram. 


2.19 Figure P2.2 gives a representation of a four-variable Venn diagram and shows the location 
of minterms mo, 7m, and mz. Show the location of the other minterms in the diagram. 
Represent the function f = X)X2x3X4 + X1X2x3X4 + X|xX2 on this diagram. 
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Figure P2.2 =A four-variable Venn diagram. 


Design the simplest sum-of-products circuit that implements the function f (x1, x2, x3) = 
> m(3, 4, 6, 7). 


Design the simplest sum-of-products circuit that implements the function f (x1, x2, x3) = 
> m1, 3, 4, 6, 7). 

Design the simplest product-of-sums circuit that implements the function f (x, x2, x3) = 
TIM (0, 2, 5). 

Design the simplest product-of-sums expression for the function f (x1, x2, x3) = 1M (0, 1, 
5,7). 


Derive the simplest sum-of-products expression for the function f (x1, x2, x3, x4) = 
X1X3X4 + XX3X4 + NX X2N3. 


Use algebraic manipulation to derive the simplest sum-of-products expression for the func- 
tion f (x1, %2, 43, X4, X5) = X1X3X5 + XyX3X4 + Xp X4Ns + XjX2X3X5. (Hint: Use the consensus 
property 17a.) 


Use algebraic manipulation to derive the simplest product-of-sums expression for the func- 
tion f (x1, X2,.%3, X4) = (X1 +3 + X4) (Xo + 3 + X4)(X1 + X2 +3). (Hint: Use the con- 
sensus property 17b.) 


Use algebraic manipulation to derive the simplest product-of-sums expression for the 
function f (x1, x2, 3, %4, x5) = (%2 +3 +5) (xy + 3 + X5)(X1 + X2 +. X5) (41 +. X4 + X5). 
(Hint: Use the consensus property 17D.) 


Design the simplest circuit that has three inputs, x), x2, and x3, which produces an output 
value of 1 whenever two or more of the input variables have the value 1; otherwise, the 
output has to be 0. 


Design the simplest circuit that has three inputs, x), x2, and x3, which produces an output 
value of 1 whenever exactly one or two of the input variables have the value 1; otherwise, 
the output has to be 0. 


Design the simplest circuit that has four inputs, x1, x2, x3, and x4, which produces an output 
value of | whenever three or more of the input variables have the value 1; otherwise, the 
output has to be 0. 
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For the timing diagram in Figure P2.3, synthesize the function f (x,, x2, x3) in the simplest 
sum-of-products form. 


xy 


——————— re 


ore 


a 
Ne 
or 


x3 


——> Time 


Figure P2.3. A timing diagram representing a logic function. 


For the timing diagram in Figure P2.3, synthesize the function f (x1, x2, x3) in the simplest 
product-of-sums form. 


For the timing diagram in Figure P2.4, synthesize the function f (x;, x2, x3) in the simplest 
sum-of-products form. 


x) 


a. 


Core 


Me 
Ne 
Cor 


x3 


—» Time 


Figure P2.4 —A timing diagram representing a logic function. 


For the timing diagram in Figure P2.4, synthesize the function f (x,, x2, x3) in the simplest 
product-of-sums form. 


Design a circuit with output f and inputs x;, xo, yj, and yo. Let X = xjxo and Y = yiyo 
represent two 2-digit binary numbers. The output f should be 1 if the numbers represented 
by X and Y are equal. Otherwise, f should be 0. 
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(a) Show the truth table for f. 
(b) Synthesize the simplest possible product-of-sums expression for f. 


Repeat Problem 2.35 for the case where f should be | only if X > Y. 
(a) Show the truth table for f. 

(b) Show the canonical sum-of-products expression for f- 

(c) Show the simplest possible sum-of-products expression for f. 


Find the minimum-cost SOP and POS forms for the function f (x1, x2, x3) = > m(1, 2, 3, 5). 
Repeat Problem 2.37 for the function f (x), x2,.x3) = }-> m(1, 4, 7) + D(2, 5). 


Repeat Problem 2.37 for the function f(x1,...,x4) = IM (0, 1, 2, 4, 5, 7, 8, 9, 10, 12, 
14, 15). 


Repeat Problem 2.37 for the function f(x), ...,x4) = )> m(0, 2, 8, 9, 10, 15) + DC, 3, 
6, 7). 


Repeat Problem 2.37 for the function f (x1, ...,%5) = IM (1, 4, 6, 7, 9, 12,15, 17, 20, 21, 
22, 23, 28, 31). 


Repeat Problem 2.37 for the function f(x, ...,%5) = ye m(O, 1, 3, 4, 6, 8, 9, 11, 13, 14, 
16, 19, 20, 21, 22, 24, 25) + D(5, 7, 12, 15, 17, 23). 


Repeat Problem 2.37 for the function f (x;,...,x5) = » m(1, 4, 6, 7,9, 10, 12, 15, 17, 19, 
20, 23, 25, 26, 27, 28, 30, 31) + D(8, 16, 21, 22). 


Find 5 three-variable functions for which the product-of-sums form has lower cost than the 
sum-of-products form. 


A four-variable logic function that is equal to | if any three or all four of its variables are 
equal to | is called a majority function. Design a minimum-cost SOP circuit that implements 
this majority function. 


Derive a minimum-cost realization of the four-variable function that is equal to | if exactly 
two or exactly three of its variables are equal to 1; otherwise it is equal to 0. 


Prove or show a counter-example for the statement: If a function f has a unique minimum- 
cost SOP expression, then it also has a unique minimum-cost POS expression. 


A circuit with two outputs has to implement the following functions 
f@1,...,x%4) = Smo, 2,4, 6, 7,9) + D(10, 11) 


B(X1,...,X4) = Yi m2, 4,9, 10, 15) + D(O, 13, 14) 


Design the minimum-cost circuit and compare its cost with combined costs of two circuits 
that implement f and g separately. Assume that the input variables are available in both 
uncomplemented and complemented forms. 


Repeat Problem 2.48 for the following functions 
f(x, .-.,%5) = Dom, 4, 5, 11, 27, 28) + DO, 12, 14, 15, 20, 31) 
g(X1,...,%5) = >> m(0, 1, 2,4, 5, 8, 14, 15, 16, 18, 20, 24, 26, 28, 31) + D(10, 11, 12, 27) 
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Consider the function f = x3x5 + X1xo%X4 + X1X2X4 + X1xX3H4 + Xp A3X4 + Xp AQNs + XjX2X5. 
Derive a minimum-cost POS expression for this function. 


Implement the function in Figure 2.31 using only NAND gates. 
Implement the function in Figure 2.31 using only NOR gates. 
Implement the circuit in Figure 2.39 using NAND and NOR gates. 


Design the simplest circuit that implements the function f (x1, x2,.+3) = Ym, 4, 6,7) 
using NAND gates. 


Design the simplest circuit that implements the function f (x1, x2, x3) = >> m(/, 3, 4, 6, 7) 
using NAND gates. 


Repeat Problem 2.54 using NOR gates. 
Repeat Problem 2.55 using NOR gates. 


(a) Use a schematic capture tool (which can be downloaded from the Internet; for example, 
from www.altera.com) to draw schematics for the following functions 


Si = XoX3Xq + Xp xx4 + Xp x7H3 + X1X2H3 
fo = 2X4 + XX. + X23 
(b) Use functional simulation to prove that f; = fo. 
(a) Use a schematic capture tool to draw schematics for the following functions 
fi = Gi + X2 + X4) + 2 +3 + X4) + 1 +3 +X) + 1 +.%3 + Xa) 
Sr = (2 +4) + (3 + X4) - 1 + Xa) 
(b) Use functional simulation to prove that f; = fo. 
Write Verilog code to implement the circuit in Figure 2.32a using the gate-level primitives. 
Repeat Problem 2.60 for the circuit in Figure 2.325. 


Write Verilog code to implement the function f (x1, x2, x3) = >> m(1, 2, 3, 4, 5, 6) using the 
gate-level primitives. Ensure that the resulting circuit is as simple as possible. 


Write Verilog code to implement the function f (x), x2, x3) = >> m(0, 1, 3, 4, 5, 6) using the 
continuous assignment. 


(a) Write Verilog code to describe the following functions 
Si = 1X3 + X2X3 + .X3X4 + XX + x X4 
fa = Oy +3) + Oy + 2 + X4) + 2 +3 + X4) 
(b) Use functional simulation to prove that f; = fr. 
Consider the following Verilog statements 
fl = (xl & x3) | (~xl & ~x3) | (x2 & x4) | (~x2 & ~x4); 


f2 = (xl & x2 & ~x3 & ~x4) | (~x!] & ~x2 & x3 & x4) | 
(xl & ~x2 & ~x3 & x4) | (~xl & x2 & x3 & ~x4); 
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(a) Write complete Verilog code to implement f1 and f2. 
(b) Use functional simulation to prove that f1 = f2. 


Consider the logic expressions 


Sf = XyX2X5 + X|X2X4Xs + ANH 4X5 + Hy XNZX4 + X|XINZXS + XIHZX4Xs + Xj XIN3X4Ns 
& = X2X3X4 + XQXZX4Xs + XpAZXgXs + NyAQX4NXs + XX3X4X5 +X X2NZX5 + Xj XQN3X4X5 


Prove or disprove that f = g. 


Repeat Problem 2.66 for the following expressions 


SF = 1X23 + X2Xq + XY X2X4 + NQNZX4q + Hy XIX3 
& = (H2 +.X3 + X4) + X2 + X4) (X + XZ 7X4) A XQ +3) (C1 + X2 + Xa) 
Repeat Problem 2.66 for the following expressions 


Sf = X2X3Xq + X2x3 + X2H4 + XXQK4 +X XQNZN5 
& = (X% +.%3 +.X4)(X2 + X4 + 5) (1 + XQ +.X3) 2 + x3 + X4 + Xs) 


A circuit with two outputs is defined by the logic functions 


f = X1X2X3 + XoX4 + HoA3.H4 + Hy xxzx4 
& = XpX3X4q + Xj AQXq + XyAZX4 + XXZXq 


Derive a minimum-cost implementation of this circuit. What is the cost of your circuit? 


Repeat Problem 2.69 for the functions 


Sf = (& + x2 +3) +3 + X4) (Ky +X. + 03) Oy + XQ + .X4)(X1 + X2 + X4) 
& = (Hy +X. +3) (HK + Xo + X4) (X2 + X3 + X4) (1 + Xz +3 + X4) 


A given system has four sensors that can produce an output of 0 or 1. The system operates 
properly when exactly one of the sensors has its output equal to 1. An alarm must be raised 
when two or more sensors have the output of 1. Design the simplest circuit that can be used 
to raise the alarm. 


Repeat Problem 2.71 for a system that has seven sensors. 


Find the minimum-cost circuit consisting only of two-input NAND gates for the function 
fQ1,...,%4) = > m(0, 1, 2,3, 4, 6, 8,9, 12). Assume that the input variables are avail- 
able in both uncomplemented and complemented forms. (Hint: Consider the complement 
of the function.) 


Repeat Problem 2.73 for the function f (x), ...,x4) = > m(2, 3, 6, 8, 9, 12). 


Find the minimum-cost circuit consisting only of two-input NOR gates for the function 
f (1, -.., x4) = ¥o m6, 7, 8, 10, 12, 14, 15). Assume that the input variables are available 
in both uncomplemented and complemented forms. (Hint: Consider the complement of 
the function.) 


Repeat Problem 2.75 for the function f (x, ...,.x4) = >> m(2, 3, 4, 5,9, 10, 11, 12, 13, 15). 
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Consider the circuit in Figure P2.5, which implements functions f and g. What is the cost of 
this circuit, assuming that the input variables are available in both true and complemented 
forms? Redesign the circuit to implement the same functions, but at as low a cost as 
possible. What is the cost of your circuit? 


all 
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Figure P2.5 = Circuit for Problem 2.78. 
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2.78 Repeat Problem 2.78 for the circuit in Figure P2.6. Use only NAND gates in your circuit. 


Figure P2.6 — Circuit for Problem 2.79. 
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NUMBER REPRESENTATION 
AND ARITHMETIC CIRCUITS 


CHAPTER OBJECTIVES 


In this chapter you will learn about: 

e Representation of numbers in computers 

e Circuits used to perform arithmetic operations 
e Performance issues in large circuits 


e Use of Verilog to specify arithmetic circuits 
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In this chapter we will discuss logic circuits that perform arithmetic operations. We will explain how numbers 
can be added, subtracted, and multiplied. We will also show how to write Verilog code to describe the 
arithmetic circuits. These circuits provide an excellent platform for illustrating the power and versatility of 
Verilog in specifying complex logic-circuit assemblies. The concepts involved in the design of arithmetic 
circuits are easily applied to a wide variety of other circuits. 

Before tackling the design of arithmetic circuits, it is necessary to discuss how numbers are repre- 
sented in digital systems. In Chapter 1 we introduced binary numbers and showed how they can be expressed 
using the positional number representation. We also discussed the conversion process between decimal and 
binary number systems. In Chapter 2 we dealt with logic variables in a general way, using variables to 
represent either the states of switches or some general conditions. Now we will use the variables to represent 
numbers. Several variables are needed to specify a number, with each variable corresponding to one digit of 
the number. 


3.1 POSITIONAL NUMBER REPRESENTATION 


When dealing with numbers and arithmetic operations, it is convenient to use standard 
symbols. Thus to represent addition we use the plus (++) symbol, and for subtraction we use 
the minus (—) symbol. In Chapter 2 we used the + symbol mostly to represent the logical 
OR operation. Even though we will now use the same symbols for two different purposes, 
the meaning of each symbol will usually be clear from the context of the discussion. In 
cases where there may be some ambiguity, the meaning will be stated explicitly. 


3.1.1 UNSIGNED INTEGERS 


The simplest numbers to consider are the integers. We will begin by considering positive 
integers and then expand the discussion to include negative integers. Numbers that are 
positive only are called unsigned, and numbers that can also be negative are called signed. 
Representation of numbers that include a radix point (real numbers) is discussed later in 
the chapter. 

As explained in Section 1.5.1, an n-bit unsigned number 


B= by-1bp-2--- by bo 
represents an integer that has the value 


V(BY = by 2! hy a x ee hy Oy KD? [3.1] 
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3.1.2 OCTAL AND HEXADECIMAL REPRESENTATIONS 


The positional number representation can be used for any radix. If the radix is r, then the 
number 


K= kn—1kn—2 ie kiko 


has the value 

n-1 

V(K)= ok xr 

i=0 
Our interest is limited to those radices that are most practical. We will use decimal numbers 
because they are used by people, and we will use binary numbers because they are used by 
computers. In addition, two other radices are useful—8 and 16. Numbers represented with 
radix 8 are called octal numbers, while radix-16 numbers are called hexadecimal numbers. 
In octal representation the digit values range from 0 to 7. In hexadecimal representation 
(often abbreviated as hex), each digit can have one of 16 values. The first ten are denoted 
the same as in the decimal system, namely, 0 to 9. Digits that correspond to the decimal 
values 10, 11, 12, 13, 14, and 15 are denoted by the letters, A, B, C, D, E, and F. Table 3.1 
gives the first 18 integers in these number systems. 

In computers the dominant number system is binary. The reason for using the octal and 

hexadecimal systems is that they serve as a useful shorthand notation for binary numbers. 
One octal digit represents three bits. Thus a binary number is converted into an octal number 


Table 3.1. == Numbers in different systems. 


Decimal Binary Octal Hexadecimal 
00 00000 00 00 
01 00001 01 01 
02 00010 02 02 
03 00011 03 03 
04 00100 04 04 
05 00101 05 05 
06 00110 06 06 
07 00111 07 07 
08 01000 10 08 
09 01001 11 09 
10 01010 12 OA 
11 01011 13 OB 
12 01100 14 0c 
13 01101 15 0D 
14 01110 16 OE 
15 01111 17 OF 
16 10000 20 10 
il7/ 10001 21 11 


18 10010 Pp; 12 
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by taking groups of three bits, starting from the least-significant bit, and replacing them 
with the corresponding octal digit. For example, 101011010111 is converted as 


101 O11 010 111 
ae Qaeew Qe ae 
5 2 7 


which means that (101011010111). = (5327). If the number of bits is not a multiple of 
three, then we add Os to the left of the most-significant bit. For example, (10111011). = 
(273)g because of the grouping 


010 111 O11 
ng ae Qe 
2: 7 


Conversion from octal to binary is just as straightforward; each octal digit is simply replaced 
by three bits that denote the same value. 

Similarly, a hexadecimal digit represents four bits. For example, a 16-bit number is 
represented by four hex digits, as in 


(1010111100100101), = (AF25)1¢ 
using the grouping 
1010 1111 0010 0101 
—— =—— 


Sa ——— 
A F 2 5 


Zeros are added to the left of the most-significant bit if the number of bits is not a multiple 
of four. For example, (1101101000) = (368) 6 because of the grouping 


0011 0110 1000 
—{ — =’ “SSSm 
3 6 8 


Conversion from hexadecimal to binary involves straightforward substitution of each hex 
digit by four bits that denote the same value. 

Binary numbers used in modern computers often have 32 or 64 bits. Written as binary 
n-tuples (sometimes called bit vectors), such numbers are awkward for people to deal with. 
It is much simpler to deal with them in the form of 8- or 16-digit hex numbers. Because 
the arithmetic operations in a digital system usually involve binary numbers, we will focus 
on circuits that use such numbers. We will sometimes use the hexadecimal representation 
as a convenient shorthand description. 

We have introduced the simplest numbers—unsigned integers. It is necessary to be 
able to deal with several other types of numbers. We will discuss the representation of 
signed numbers, fixed-point numbers, and floating-point numbers later in this chapter. But 
first we will examine some simple circuits that operate on numbers to give the reader a 
feeling for digital circuits that perform arithmetic operations and to provide motivation for 
further discussion. 


3.2 ADDITION OF UNSIGNED NUMBERS 
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Binary addition is performed in the same way as decimal addition except that the values 
of individual digits can be only 0 or 1. In Chapter 2, we already considered the addition 
of 2 one-bit numbers, as an example of a simple logic circuit. Now, we will consider this 
task in the context of general adder circuits. The one-bit addition entails four possible 
combinations, as indicated in Figure 3.1a. Two bits are needed to represent the result of the 
addition. The right-most bit is called the sum, s. The left-most bit, which is produced as 
a carry-out when both bits being added are equal to 1, is called the carry, c. The addition 
operation is defined in the form of a truth table in part (b) of the figure. The sum bit s is 
the XOR function. The carry c is the AND function of inputs x and y. A circuit realization 
of these functions is shown in Figure 3.1c. This circuit, which implements the addition of 
only two bits, is called a half-adder. 


0 0 1 
+y +0 +1 +0 +1 
Cs 01 01 10 


Carry Sum 
x y c RY 
0 0 0 0 
0 1 0 1 
1 0 0 1 
1 1 1 0 
(b) Truth table 
x 
S 
y 
xX — AY 
y —— Cc 
Cc 
(c) Circuit (d) Graphical symbol 


Figure 3.1) Half-adder. 
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Generated carries ——» 1110 . [Cen 

X = X4XZX_X1Xq O1ll11 (15) 19 Sahib X; 

+¥ = ygy3VoV1 No +01010 + (10) i wee 

S = 8453545189 11001 (25) 19 i ps 5; 
(a) An example of addition (b) Bit position i 


Figure 3.2 = Addition of multibit numbers. 


A more interesting case is when larger numbers that have multiple bits are involved. 
Then it is still necessary to add each pair of bits, but for each bit position i, the addition 
operation may include a carry-in from bit position i — 1. 

Figure 3.2a presents an example of the addition operation. The two operands are X = 
(01111). = C15);9 and Y = (01010), = (10);9. Five bits are used to represent X and 
Y, making it possible to represent integers in the range from 0 to 31; hence the sum 
S =X + Y = (25),) canalso be denoted as a five-bit integer. Note the labeling of individual 
bits, such that X = x4x3x2x1xo and Y = y4y3y2y1 yo. The figure shows, in a blue color, the 
carries generated during the addition process. For example, a carry of 0 is generated when 
xo and yo are added, a carry of 1 is produced when x; and y, are added, and so on. 

In Chapter 2 we designed logic circuits by first specifying their behavior in the form 
of a truth table. This approach is impractical in designing an adder circuit that can add the 
five-bit numbers in Figure 3.2. The required truth table would have 10 input variables, 5 
for each number X and Y. It would have 2! = 1024 rows! A better approach is to consider 
the addition of each pair of bits, x; and y,;, separately. 

For bit position 0, there is no carry-in, and hence the addition is the same as for 
Figure 3.1. For each other bit position i, the addition involves bits x; and y;, and a carry-in 
cj, as illustrated in Figure 3.2b. This observation leads to the design of a logic circuit that 
has three inputs x;, y;, and c;, and produces the two outputs s; and c;,,. The required truth 
table is shown in Figure 3.3a. The sum bit, s;, is the modulo-2 sum of x;, y;, and c;. The 
carry-out, Ci+1, is equal to 1 if the sum of x;, y;, and c; is equal to either 2 or 3. Karnaugh 
maps for these functions are shown in part (b) of the figure. For the carry-out function the 
optimal sum-of-products realization is 


Cit = XiVi TF XiCi TH ViCi 


For the s; function a sum-of-products realization is 


Sj = XiVICi TH XIV {Ci TF XIY{Ci TF XiViCi 


Amore attractive way of implementing this function is by using the XOR gates, as explained 
below. 
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cj oo O1 i11 10 


0 0 
0 0 
0 1 
0 1 
1 0 
1 0 
1 1 
it, <L 


(a) Truth table 


rPoOorOoOrFrOoOrFr oO 
Rare RK OF CC oO 
Fe COOrOCOrFrrFK CC 


(b) Karnaugh maps 


i+] 


(c) Circuit 


Figure 3.3 Full-adder. 
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Use of XOR Gates 

As shown in Chapter 2, the XOR function of two variables is defined as x; @ x2 = 
X1X2 + xX Xz. The preceding expression for the sum bit can be manipulated into a form that 
uses only XOR operations as follows 


Si = iy; + xiy)Ci + iy; + xiY)Ci 
= (x1 Oye; + Wi O Wc 
= (x; B yi) B cj 


The XOR operation is associative; hence we can write 
Si = Xi BY; OC; 


Therefore, a three-input XOR operation can be used to realize s;. 

The XOR operation generates as an output a modulo-2 sum of its inputs. Thus, the 
output is equal to | if an odd number of inputs have the value 1, and it is equal to 0 
otherwise. For this reason the XOR is sometimes referred to as the odd function. Observe 
that the XOR has no minterms that can be combined into a larger product term, as evident 
from the checkerboard pattern for function s; in the map in Figure 3.3b. The logic circuit 
implementing the truth table in Figure 3.3a is given in Figure 3.3c. This circuit is known 
as a full-adder. 

Another interesting feature of XOR gates is that a two-input XOR gate can be thought 
of as using one input as a control signal that determines whether the true or complemented 
value of the other input will be passed through the gate as the output value. This is clear 
from the definition of XOR, where x; ® y; = xy + xy. Consider x to be the control input. 
Then if x = 0, the output will be equal to the value of y. But if x = 1, the output will be 
equal to the complement of y. In the derivation above, we used algebraic manipulation 
to derive s; = (x; ® y;) ® c;. We could have obtained the same expression immediately 
by making the following observation. In the top half of the truth table in Figure 3.3a, c; 
is equal to 0, and the sum function s; is the XOR of x; and y;. In the bottom half of the 
table, c; is equal to 1, while s; is the complemented version of its top half. This observation 
leads directly to our expression using 2 two-input XOR operations. We will encounter an 
important example of using XOR gates to pass true or complemented signals under the 
control of another signal in Section 3.3.3. 

In the preceding discussion we encountered the complement of the XOR operation, 
which we denoted as x ® y. This operation is used so commonly that it is given the distinct 
name XNOR. A special symbol, ©, is often used to denote the XNOR operation, namely 


XxOy=x@y 


The XNOR is sometimes also referred to as the coincidence operation because it produces 
the output of 1 when its inputs coincide in value; that is, they are both 0 or both 1. 
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(a) Block diagram 


(b) Detailed diagram 


Figure 3.4 A decomposed implementation of the full-adder circuit. 


3.2.1 DECOMPOSED FULL-ADDER 


In view of the names used for the circuits, one can expect that a full-adder can be constructed 
using half-adders. This can be accomplished by creating a multilevel circuit given in 
Figure 3.4. It uses two half-adders to form a full-adder. The reader should verify the 
functional correctness of this circuit. 


3.2.2 RIPPLE-CARRY ADDER 


To perform addition by hand, we start from the least-significant digit and add pairs of digits, 
progressing to the most-significant digit. If a carry is produced in position i, then this carry is 
added to the operands in position i + 1. The same arrangement can be used in a logic circuit 
that performs addition. For each bit position we can use a full-adder circuit, connected as 
shown in Figure 3.5. Note that to be consistent with the customary way of writing numbers, 
the least-significant bit position is on the right. Carries that are produced by the full-adders 
propagate to the left. 

When the operands X and Y are applied as inputs to the adder, it takes some time before 
the output sum, S, is valid. Each full-adder introduces a certain delay before its s; and cj+1 
outputs are valid. Let this delay be denoted as At. Thus the carry-out from the first stage, 
C1, arrives at the second stage At after the application of the xo and yo inputs. The carry-out 
from the second stage, c2, arrives at the third stage with a 2At delay, and so on. The signal 
Cn—1 18 valid after a delay of (n — 1) At, which means that the complete sum is available 
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Xn-1) Yn-1 Ay. ML Xo Yo 
al 
_ _ " ; nas " a _ 
Sn-1 Sy So 
MSB position LSB position 


Figure 3.5.) = Ann-bit ripple-carry adder. 


after a delay of nAt. Because of the way the carry signals “ripple” through the full-adder 
stages, the circuit in Figure 3.5 is called a ripple-carry adder. 

The delay incurred to produce the final sum and carry-out in a ripple-carry adder 
depends on the size of the numbers. When 32- or 64-bit numbers are used, this delay 
may become unacceptably high. Because the circuit in each full-adder leaves little room 
for a drastic reduction in the delay, it may be necessary to seek different structures for 
implementation of n-bit adders. We will discuss a technique for building high-speed adders 
in Section 3.4. 

So far we have dealt with unsigned integers only. The addition of such numbers does 
not require a carry-in for stage 0. In Figure 3.5 we included co in the diagram so that the 
ripple-carry adder can also be used for subtraction of numbers, as we will see in Section 3.3. 


3.2.3 DESIGN EXAMPLE 


Suppose that we need a circuit that multiplies an eight-bit unsigned number by 3. Let 
A = a7d6-+-a,a9 denote the number and P = pops ---pipo denote the product P = 3A. 
Note that 10 bits are needed to represent the product. 

A simple approach to design the required circuit is to use two ripple-carry adders to 
add three copies of the number A, as illustrated in Figure 3.6a. The symbol that denotes 
each adder is a commonly-used graphical symbol for adders. The letters x;, y;, 5;, and c; 
indicate the meaning of the inputs and outputs according to Figure 3.5. The first adder 
produces A + A = 2A. Its result is represented as eight sum bits and the carry from the 
most-significant bit. The second adder produces 2A + A = 3A. It has to be a nine-bit adder 
to be able to handle the nine bits of 2A, which are generated by the first adder. Because the 
y; inputs have to be driven only by the eight bits of A, the ninth input yg is connected to a 
constant 0. 

This approach is straightforward, but not very efficient. Because 3A = 2A + A, we 
can observe that 2A can be generated by shifting the bits of A one bit-position to the left, 
which gives the bit pattern a7dga5a4a342a,a90. According to Equation 3.1, this pattern is 
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A: a7 «2 2 A 


P=3A: Py Px P; 


(a) Naive approach 


A: a7 «+ a 


(b) Efficient design 


Figure 3.6 — Circuit that multiplies an eight-bit unsigned number by 3. 
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equal to 2A. Then a single ripple-carry adder suffices for implementing 3A, as shown in 
Figure 3.6b. This is essentially the same circuit as the second adder in part (a) of the figure. 
Note that the input xo is connected to a constant 0. Note also that in the second adder in part 
(a) of the figure the value of xo is always 0, even though it is driven by the least-significant 
bit, so, of the sum of the first adder. Because x9 = yo = dog in the first adder, the sum bit so 
will be 0, whether ado is O or 1. 


3.3 SIGNED NUMBERS 


In the decimal system the sign of a number is indicated by a + or — symbol to the left 
of the most-significant digit. In the binary system the sign of a number is denoted by the 
left-most bit. For a positive number the left-most bit is equal to 0, and for a negative number 
it is equal to 1. Therefore, in signed numbers the left-most bit represents the sign, and the 
remaining n — | bits represent the magnitude, as illustrated in Figure 3.7. It is important to 
note the difference in the location of the most-significant bit (MSB). In unsigned numbers 
all bits represent the magnitude of a number; hence all n bits are significant in defining 
the magnitude. Therefore, the MSB is the left-most bit, b,_;. In signed numbers there are 
n — | significant bits, and the MSB is in bit position b,_2. 


n—-1 


| Magnitude 
MSB 


Sion Magnitude 


0 denotes + 
1 denotes — MSB 


(b) Signed number 


Figure 3.7 —_ Formats for representation of integers. 
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3.3.1 NEGATIVE NUMBERS 


Positive numbers are represented using the positional number representation as explained 
in the previous section. Negative numbers can be represented in three different ways: 
sign-and-magnitude, 1’s complement, and 2’s complement. 


Sign-and-Magnitude Representation 

In the familiar decimal representation, the magnitude of both positive and negative 
numbers is expressed in the same way. The sign symbol distinguishes a number as being 
positive or negative. This scheme is called the sign-and-magnitude number representation. 
The same scheme can be used with binary numbers in which case the sign bit is 0 or 1 
for positive or negative numbers, respectively. For example, if we use four-bit numbers, 
then +5 = 0101 and —5 = 1101. Because of its similarity to decimal sign-and-magnitude 
numbers, this representation is easy to understand. However, as we will see shortly, this 
representation is not well suited for use in computers. More suitable representations are 
based on complementary systems, explained below. 


1’s Complement Representation 

In a complementary number system, the negative numbers are defined according to a 
subtraction operation involving positive numbers. We will consider two schemes for binary 
numbers: the 1’s complement and the 2’s complement. In the /’s complement scheme, an 
n-bit negative number, K, is obtained by subtracting its equivalent positive number, P, 
from 2” — 1; that is, K = (2” — 1) — P. For example, if n = 4, then K = (24-1)-P= 
(15)i0 — P = (1111)2 — P. If we convert +5 to a negative, we get —5 = 1111 — 0101 = 
1010. Similarly, +3 = 0011 and —3 = 1111 — 0011 = 1100. Clearly, the 1’s complement 
can be obtained simply by complementing each bit of the number, including the sign bit. 
While 1’s complement numbers are easy to derive, they have some drawbacks when used 
in arithmetic operations, as we will see in the next section. 


2’s Complement Representation 

In the 2’s complement scheme, a negative number, K, is obtained by subtracting its 
equivalent positive number, P, from 2”; namely, K = 2” — P. Using our four-bit example, 
—5 = 10000 — 0101 = 1011, and —3 = 10000 — 0011 = 1101. Finding 2’s complements 
in this manner requires performing a subtraction operation that involves borrows. However, 
we can observe that if K; is the 1’s complement of P and K2 is the 2’s complement of P, 
then 


K,=("-1)-P 
Ky =2"—P 


It follows that Ky = K, + 1. Thus a simpler way of finding a 2’s complement of a number 
is to add | to its 1’s complement because finding a 1’s complement is trivial. This is how 
2’s complement numbers are obtained in logic circuits that perform arithmetic operations. 

The reader will need to develop an ability to find 2’s complement numbers quickly. 
There is a simple rule that can be used for this purpose. 
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Rule for Finding 2’s Complements 

Given anumber B = b,_;b,_2-+- bj bo, its 2’s complement, K = ky, ky-2 +--+ kiko, can 
be found by examining the bits of B from right to left and taking the following action: copy 
all bits of B that are O and the first bit that is 1; then simply complement the rest of the bits. 

For example, if B = 0110, then we copy kp = bop = 0 and k; = b; = 1, and comple- 
ment the rest so that ky = b> = 0 and k; = b3 = 1. Hence K = 1010. As another example, 
if B = 10110100, then K = 01001100. We leave the proof of this rule as an exercise for 
the reader. 

Table 3.2 illustrates the interpretation of all 16 four-bit patterns in the three signed- 
number representations that we have considered. Note that for both sign-and-magnitude 
representation and for 1’s complement representation there are two patterns that represent 
the value zero. For 2’s complement there is only one such pattern. Also, observe that the 
range of numbers that can be represented with four bits in 2’s complement form is —8 to 
+7, while in the other two representations it is —7 to +7. 

Using 2’s-complement representation, an n-bit number B = b,_,b,_2-- +b, bo repre- 
sents the value 


V(B) = (—Dy-1 X 2"!) + Dan x 2"? + + by x 2! + dy x 2° [3.2] 


Thus the largest negative number, 100...00, has the value —2”~!. The largest positive 
number, 011... 11, has the value 2”~! — 1. 


Table 3.2 _Interpretation of four-bit signed integers. 


Sign and 
b3brb, bo magnitude 1’s complement 2’s complement 
O111 +7 +7 +7 
0110 +6 +6 +6 
0101 +5 +5 +5 
0100 +4 +4 +4 
0011 +3 +3 +3 
0010 +2 +2 +2 
0001 +1 +1 +1 
0000 +0 +0 +0 
1000 —0 —7 —8 
1001 -1 —6 -7 
1010 —2 —5 -—6 
1011 3 —4 —5 
1100 —4 —3 —4 
1101 —5 —2 —3 
1110 —6 -1 —2 


1111 =7/ —0 =1 
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3.3.2 ADDITION AND SUBTRACTION 


To assess the suitability of different number representations, it is necessary to investigate 
their use in arithmetic operations—particularly in addition and subtraction. We can illustrate 
the good and bad aspects of each representation by considering very small numbers. We will 
use four-bit numbers, consisting of a sign bit and three significant bits. Thus the numbers 
have to be small enough so that the magnitude of their sum can be expressed in three bits, 
which means that the sum cannot exceed the value 7. 

Addition of positive numbers is the same for all three number representations. It is 
actually the same as the addition of unsigned numbers discussed in Section 3.2. But there 
are significant differences when negative numbers are involved. The difficulties that arise 
become apparent if we consider operands with different combinations of signs. 


Sign-and-Magnitude Addition 

If both operands have the same sign, then the addition of sign-and-magnitude numbers 
is simple. The magnitudes are added, and the resulting sum is given the sign of the operands. 
However, if the operands have opposite signs, the task becomes more complicated. Then 
it is necessary to subtract the smaller number from the larger one. This means that logic 
circuits that compare and subtract numbers are also needed. We will see shortly that it 
is possible to perform subtraction without the need for this circuitry. For this reason, the 
sign-and-magnitude representation is not used in computers. 


1’s Complement Addition 

An obvious advantage of the 1’s complement representation is that a negative num- 
ber is generated simply by complementing all bits of the corresponding positive number. 
Figure 3.8 shows what happens when two numbers are added. There are four cases to 
consider in terms of different combinations of signs. As seen in the top half of the figure, 
the computation of 5 + 2 = 7 and (—5) + 2 = (—3) is straightforward; a simple addition 
of the operands gives the correct result. Such is not the case with the other two possibilities. 
Computing 5 + (—2) = 3 produces the bit vector 10010. Because we are dealing with 
four-bit numbers, there is a carry-out from the sign-bit position. Also, the four bits of the 


(+5) 0101 (-5) 1010 
+ (+2) + 0010 + (+2) + 0010 
(+7) 0111 (-3) 1100 
(+5) 0101 (-5) 1010 
+ (-2) +1101 + (-2) +1101 
(+3) 10010 (-7) 10111 
1 1 

0011 1000 


Figure 3.8 — Examples of 1’s complement addition. 
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result represent the number 2 rather than 3, which is a wrong result. Interestingly, if we 
take the carry-out from the sign-bit position and add it to the result in the least-significant 
bit position, the new result is the correct sum of 3. This correction is indicated in blue in 
the figure. A similar situation arises when adding (—5) + (—2) = (—7). After the initial 
addition the result is wrong because the four bits of the sum are 0111, which represents +7 
rather than —7. But again, there is a carry-out from the sign-bit position, which can be used 
to correct the result by adding it in the LSB position, as shown in Figure 3.8. 

The conclusion from these examples is that the addition of 1’s complement numbers 
may or may not be simple. In some cases a correction is needed, which amounts to an extra 
addition that must be performed. Consequently, the time needed to add two 1’s complement 
numbers may be twice as long as the time needed to add two unsigned numbers. 


2’s Complement Addition 

Consider the same combinations of numbers as used in the 1’s complement example. 
Figure 3.9 indicates how the addition is performed using 2’s complement numbers. Adding 
5+ 2 = 7 and (—5) + 2 = (—3) is straightforward. The computation 5 + (—2) = 3 gen- 
erates the correct four bits of the result, namely 0011. There is a carry-out from the sign-bit 
position, which we can simply ignore. The fourth case is (—5) + (—2) = (—7). Again, the 
four bits of the result, 1001, give the correct sum (—7). In this case also, the carry-out from 
the sign-bit position can be ignored. 

As illustrated by these examples, the addition of 2’s complement numbers is very 
simple. When the numbers are added, the result is always correct. If there is a carry-out 
from the sign-bit position, it is simply ignored. Therefore, the addition process is the same, 
regardless of the signs of the operands. It can be performed by an adder circuit, such as 
the one shown in Figure 3.5. Hence the 2’s complement notation is highly suitable for 
the implementation of addition operations. We will now consider its use in subtraction 
operations. 


(+ 5) 0101 (-5) 1011 

+ (+2) +0010 + (+2) +0010 

(+7) O111 (-3) 1101 

(+ 5) 0101 (-5) 1011 

+ (-2) +1110 + (-2) +1110 

(+ 3) 10011 (-7) 11001 
ignore ignore 


Figure 3.9 — Examples of 2’s complement addition. 
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2’s Complement Subtraction 


The easiest way of performing subtraction is to negate the subtrahend and add it to 
the minuend. This is done by finding the 2’s complement of the subtrahend and then 
performing the addition. Figure 3.10 illustrates the process. The operation 5 — (+2) = 3 
involves finding the 2’s complement of +2, which is 1110. When this number is added to 
0101, the result is 0011 = (+3) and a carry-out from the sign-bit position occurs, which is 
ignored. A similar situation arises for (—5) — (+2) = (—7). In the remaining two cases 
there is no carry-out, and the result is correct. 

As a graphical aid to visualize the addition and subtraction examples in Figures 3.9 and 
3.10, we can place all possible four-bit patterns on a modulo- 16 circle given in Figure 3.1 1a. 
If these bit patterns represented unsigned integers, they would be numbers 0 to 15. If they 
represent 2’s-complement integers, then the numbers range from —8 to +7, as shown. 
The addition operation is done by stepping in the clockwise direction by the magnitude of 
the number to be added. For example, —5 + 2 is determined by starting at 1011 (= —5) 
and moving clockwise two steps, giving the result 1101 (= —3). Figure 3.11b shows how 
subtraction can be performed on the modulo- 16 circle, using the example 5 — 2 = 3. Wecan 
start at 0101 (= +5) and move counterclockwise by two steps, which gives 0011 (= +3). 
But, we can also use the 2’s complement of 2 and add this value by stepping in the clockwise 


(+5) 0101 0101 
— (+2) - 0010 —> +1110 
(+ 3) 10011 
ignore 
(-5) 1011 1011 
-(42) -0010 cS #1110 
(-7) 11001 
ignore 
(+5) 0101 0101 
—(=2) —1110 == +0010 
(+7) O1lll 
(-5) 1011 LOU 7 
— (-2) -1110 —<P +0010 
(-3) 1101 
Figure 3.10 — Examples of 2’s complement subtraction. 
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(a) The number circle (b) Subtracting 2 by adding its 2’s complement 


Figure 3.11 Graphical interpretation of four-bit 2’s complement numbers. 


direction as shown in the figure. Since there are 16 numbers on the circle, the value we 
need to add is 16 — 2 = (14)i9 = (1110)2. 

The key conclusion of this section is that the subtraction operation can be realized as 
the addition operation, using a 2’s complement of the subtrahend, regardless of the signs of 
the two operands. Therefore, it should be possible to use the same adder circuit to perform 
both addition and subtraction. 


3.3.3 ADDER AND SUBTRACTOR UNIT 


The only difference between performing addition and subtraction is that for subtraction it 
is necessary to use the 2’s complement of one operand. Let X and Y be the two operands, 
such that Y serves as the subtrahend in subtraction. From Section 3.3.1 we know that a 
2’s complement can be obtained by adding | to the 1’s complement of Y. Adding 1 in the 
least-significant bit position can be accomplished simply by setting the carry-in bit co to 1. 
A 1’s complement of a number is obtained by complementing each of its bits. This could be 
done with NOT gates, but we need a more flexible circuit where we can use the true value 
of Y for addition and its complement for subtraction. 

In Section 3.2 we explained that two-input XOR gates can be used to choose between 
true and complemented versions of an input value, under the control of the other input. This 
idea can be applied in the design of the adder/subtractor unit as follows. Assume that there 
exists a control signal that chooses whether addition or subtraction is to be performed. Let 
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Yn-1 yy Yo 


Add /Sub 
control 


n-bit adder 


n-1 


Figure 3.12 = Adder/subtractor unit. 


this signal be called Add/Sub. Also, let its value be 0 for addition and 1 for subtraction. To 
indicate this fact, we placed a bar over Add. This is a commonly used convention, where 
a bar over a name means that the action specified by the name is to be taken if the control 
signal has the value 0. Now let each bit of Y be connected to one input of an XOR gate, with 
the other input connected to Add/Sub. The outputs of the XOR gates represent Y if Add/Sub 
= 0, and they represent the 1’s complement of Y if Add/Sub = 1. This leads to the circuit 
in Figure 3.12. The main part of the circuit is an n-bit adder, which can be implemented 
using the ripple-carry structure of Figure 3.5. Note that the control signal Add/Sub is also 
connected to the carry-in co. This makes co = 1 when subtraction is to be performed, thus 
adding the 1 that is needed to form the 2’s complement of Y. When the addition operation 
is performed, we will have cp = 0. 

The combined adder/subtractor unit is a good example of an important concept in the 
design of logic circuits. It is useful to design circuits to be as flexible as possible and to 
exploit common portions of circuits for as many tasks as possible. This approach minimizes 
the number of gates needed to implement such circuits, and it reduces the wiring complexity 
substantially. 


3.3.4 RADIX-COMPLEMENT SCHEMES* 


The 2’s complement scheme is just a special case of radix-complement schemes which we 
discuss in this section. This general discussion can be skipped without loss of continuity in 
the context of computer technology. 
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The idea of performing a subtraction operation by addition of a complement of the 
subtrahend is not restricted to binary numbers. We can gain some insight into the workings 
of the 2’s complement scheme by considering its counterpart in the decimal number system. 
Consider the subtraction of two-digit decimal numbers. Computing a result such as 74 — 
33 = 41 is simple because each digit of the subtrahend is smaller than the corresponding 
digit of the minuend; therefore, no borrow is needed in the computation. But computing 
74 — 36 = 38 is not as simple because a borrow is needed in subtracting the least-significant 
digit. If a borrow occurs, the computation becomes more complicated. 

Suppose that we restructure the required computation as follows 


74 — 36 = 74+ 100 — 100 — 36 
= 74+ (100 — 36) — 100 


Now two subtractions are needed. Subtracting 36 from 100 still involves borrows. But 
noting that 100 = 99 + 1, these borrows can be avoided by writing 


74 — 36 = 74+ (99 + 1 — 36) — 100 
= 74+ (99 — 36) + 1 — 100 


The subtraction in parentheses does not require borrows; it is performed by subtracting each 
digit of the subtrahend from 9. We can see a direct correlation between this expression and 
the one used for 2’s complement, as reflected in the circuit in Figure 3.12. The operation 
(99 — 36) is analogous to complementing the subtrahend Y to find its 1’s complement, 
which is the same as subtracting each bit from 1. Using decimal numbers, we find the 9's 
complement of the subtrahend by subtracting each digit from 9. In Figure 3.12 we add 
the carry-in of 1 to form the 2’s complement of Y. In our decimal example we perform 
(99 — 36) + 1 = 64. Here 64 is the 10’s complement of 36. For an n-digit decimal number, 
N, its 10’s complement, Ko, is defined as Kjg = 10” — N, while its 9’s complement, Ko, is 
Ky = (10" — 1) —-N. 

Thus the required subtraction (74 — 36) can be performed by addition of the 10’s 
complement of the subtrahend, as in 


74 — 36 = 74+ 64 — 100 
= 138 — 100 
= 38 
The subtraction 138 — 100 is trivial because it means that the leading digit in 138 is simply 


deleted. This is analogous to ignoring the carry-out from the circuit in Figure 3.12, as 
discussed for the subtraction examples in Figure 3.10. 


Example 3.1 


Suppose that A and B are n-digit decimal numbers. Using the above 10’s-complement 
approach, B can be subtracted from A as follows: 


A=-RSAL 10 =) = 10" 


If A > B, then the operation A + (10” — B) produces a carry-out of 1. This carry is equiva- 
lent to 10”; hence it can be simply ignored. 
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ButifA < B, then the operation A + (10" — B) produces a carry-out of 0. Let the result 
obtained be M, so that 


A—B=M — 10" 
We can rewrite this as 
10° (B-—A)=M 


The left side of this equation is the 10’s complement of (B — A). The 10’s complement of 
a positive number represents a negative number that has the same magnitude. Hence M 
correctly represents the negative value obtained from the computation A — B when A < B. 
This concept is illustrated in the examples that follow. 
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When dealing with binary signed numbers we use 0 in the left-most bit position to denote 
a positive number and | to denote a negative number. If we wanted to build hardware that 
operates on signed decimal numbers, we could use a similar approach. Let 0 in the left-most 
digit position denote a positive number and let 9 denote a negative number. Note that 9 is 
the 9’s complement of 0 in the decimal system, just as 1 is the 1’s complement of 0 in the 
binary system. 

Thus, using three-digit signed numbers, A = 045 and B = 027 are positive numbers 
with magnitudes 45 and 27, respectively. The number B can be subtracted from A as follows 


A — B= 045 — 027 
= 045 + 1000 — 1000 — 027 
= 045 + (999 — 027) + 1 — 1000 
= 045 + 972 + 1 — 1000 
= 1018 — 1000 
= 018 
This gives the correct answer of +18. 
Next consider the case where the minuend has lower value than the subtrahend. This 
is illustrated by the computation 
B—A=027—- 045 
= 027+ 1000 — 1000 — 045 
= 027 + (999 — 045) + 1 — 1000 
= 027 + 954 + 1 — 1000 
= 982 — 1000 
From this expression it appears that we still need to perform the subtraction 982 — 1000. 
But as seen in Example 3.1, this can be rewritten as 
982 = 1000+ B-—A 
= 1000 — (A— B) 


Example 3.2 
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Therefore, 982 is the negative number that results when forming the 10’s complement of 
(A — B). From the previous computation we know that (A — B) = 018, which denotes 
+18. Thus the signed number 982 is the 10’s complement representation of —18, which is 
the required result. 


These examples illustrate that signed numbers can be subtracted without using a sub- 
traction operation that involves borrows. The only subtraction needed is in forming the 
9’s complement of the subtrahend, in which case each digit is simply subtracted from 9. 
Thus a circuit that forms the 9’s complement, combined with a normal adder circuit, will 
suffice for both addition and subtraction of decimal signed numbers. A key point is that the 
hardware needs to deal only with n digits if n-digit numbers are used. Any carry that may 
be generated from the left-most digit position is simply ignored. 

The concept of subtracting a number by adding its radix-complement is general. If 
the radix is r, then the r’s complement, K,, of an n-digit number, NV, is determined as 
K, =r" —N. The (r — 1)’s complement, K,_1, is defined as K,_; = (r” — 1) — N;; it is 
computed simply by subtracting each digit of N from the value (r— 1). The (r— 1)’s 
complement is referred to as the diminished-radix complement. Circuits for forming the 
(r — 1)’s complements are simpler than those for general subtraction that involves borrows. 
The circuits are particularly simple in the binary case, where the 1’s complement requires 
just inverting each bit. 


Example 3.3 


In Figure 3.10 we illustrated the subtraction operation on binary numbers given in 2’s- 
complement representation. Consider the computation (+5) — (+2) = (+3), using the 
approach discussed above. Each number is represented by a four-bit pattern. The value 2+ 
is represented as 10000. Then 
0101 — 0010 = 0101 + (10000 — 0010) — 10000 

= 0101 + (1111 — 0010) + 1 — 10000 

= 0101 + 1101 + 1 — 10000 

= 10011 — 10000 

= 0011 


Because 5 > 2, there is a carry from the fourth bit position. It represents the value 24, 
denoted by the pattern 10000. 


Example 3.4 


Consider now the computation (+2) — (+5) = (—3), which gives 


0010 — 0101 = 0010 + (10000 — 0101) — 10000 
= 0010 + (1111 — 0101) + 1 — 10000 
= 0010 + 1010 + 1 — 10000 
= 1101 — 10000 


3.3. SIGNED NUMBERS 


Because 2 < 5, there is no carry from the fourth bit position. The answer, 1101, is the 
2’s-complement representation of —3. Note that 


1101 = 10000 + 0010 — 0101 
= 10000 — (0101 — 0010) 
= 10000 — 0011 


indicating that 1101 is the 2’s complement of 0011 (+3). 
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Finally, consider the case where the subtrahend is a negative number. The computation 
(+5) — (—2) = (47) is done as follows 


0101 — 1110 = 0101 + (10000 — 1110) — 10000 
= 0101 + (1111 — 1110) + 1 — 10000 
= 0101 + 0001 + 1 — 10000 
= 0111 — 10000 


While 5 > (—2), the pattern 1110 is greater than the pattern 0101 when the patterns are 
treated as unsigned numbers. Therefore, there is no carry from the fourth bit position. The 
answer 0111 is the 2’s complement representation of +7. Note that 


0111 = 10000 + 0101 — 1110 
= 10000 — (1110 — 0101) 
= 10000 — 1001 


and 1001 represents —7. 


3.3.5 ARITHMETIC OVERFLOW 


The result of addition or subtraction is supposed to fit within the significant bits used to 
represent the numbers. If 7 bits are used to represent signed numbers, then the result must 
be in the range —2”~! to 2”! — 1. If the result does not fit in this range, then we say that 
arithmetic overflow has occurred. To ensure the correct operation of an arithmetic circuit, 
it is important to be able to detect the occurrence of overflow. 

Figure 3.13 presents the four cases where 2’s-complement numbers with magnitudes 
of 7 and 2 are added. Because we are using four-bit numbers, there are three significant bits, 
by». When the numbers have opposite signs, there is no overflow. But if both numbers 
have the same sign, the magnitude of the result is 9, which cannot be represented with just 
three significant bits; therefore, overflow occurs. The key to determining whether overflow 
occurs is the carry-out from the MSB position, called c3 in the figure, and from the sign-bit 
position, called cy. The figure indicates that overflow occurs when these carry-outs have 
different values, and a correct sum is produced when they have the same value. Indeed, this 
is true in general for both addition and subtraction of 2’s-complement numbers. As a quick 
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(+7) 0111 (-7) 1001 
+ (+2) +0010 + (+2) +0010 
(+ 9) 1001 (-5) 1011 
C4 = cy = 0 
c, = 1 cz, = 0 
(+7) 0111 (-7) 1001 
+ (-2) +1110 + (-2) +1110 
(+5) 10101 (-9) 10111 
C4 1 cy = 1 
ec, = 1 C3 = 


Figure 3.13 = Examples for determination of overflow. 


check of this statement, consider the examples in Figure 3.9 where the numbers are small 
enough so that overflow does not occur in any case. In the top two examples in the figure, 
there is a carry-out of 0 from both sign and MSB positions. In the bottom two examples, 
there is a carry-out of | from both positions. Therefore, for the examples in Figures 3.9 and 
3.13, the occurrence of overflow is detected by 


Overflow = c3¢4 + €3c4 


=C30c4 
For n-bit numbers we have 
Overflow = Cy_1 ® Cn 


Thus the circuit in Figure 3.12 can be modified to include overflow checking with the 
addition of one XOR gate. 

An alternative and more intuitive way of detecting the arithmetic overflow is to ob- 
serve that overflow occurs if both summands have the same sign but the resulting sum has 
a different sign. Let X = x3x2x;xo and Y = y3y2y, yo represent four-bit 2’s-complement 
numbers, and let § = 53525,59 be the sum S = X + Y. Then 


Overflow = x39353 + 39353 


The carry-out and overflow signals indicate whether the result of a given addition is too 
large to fit into the number of bits available to represent the sum. The carry-out is meaningful 
only when unsigned numbers are involved, while the overflow is meaningful only in the 
case of signed numbers. In a typical computer, it is prudent to use the same adder circuits 
for dealing with both unsigned and signed operands, thus reducing the amount of circuitry 
required. This means that both the carry-out and overflow signals should be generated, as 
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we have discussed. Then, a program instruction that specifies unsigned operands can use 
the carry-out signal, while an instruction that has signed operands can use the overflow 
signal. 


3.3.6 PERFORMANCE ISSUES 


When buying a digital system, such as a computer, the buyer pays particular attention to 
the performance that the system is expected to provide and to the cost of acquiring the 
system. Superior performance usually comes at a higher cost. However, a large increase in 
performance can often be achieved at a modest increase in cost. Acommonly used indicator 
of the value of a system is its price/performance ratio. 

The addition and subtraction of numbers are fundamental operations that are performed 
frequently in the course of a computation. The speed with which these operations are 
performed has a strong impact on the overall performance of a computer. In light of this, 
let us take a closer look at the speed of the adder/subtractor unit in Figure 3.12. We are 
interested in the largest delay from the time the operands X and Y are presented as inputs, 
until the time all bits of the sum S and the final carry-out, c,, are valid. Most of this delay 
is caused by the n-bit adder circuit. Assume that the adder is implemented using the ripple- 
carry structure in Figure 3.5 and that each full-adder stage is the circuit in Figure 3.3c. The 
delay for the carry-out signal in this circuit, Ar, is equal to two gate delays. From Section 
3.2.2 we know that the final result of the addition will be valid after a delay of nAt, which 
is equal to 2n gate delays. In addition to the delay in the ripple-carry path, there is also a 
delay in the XOR gates that feed either the true or complemented value of Y to the adder 
inputs. If this delay is equal to one gate delay, then the total delay of the circuit in Figure 
3.12 is 2n + | gate delays. For a large n, say n = 32 or n = 64, the delay would lead to 
unacceptably poor performance. Therefore, it is important to find faster circuits to perform 
addition. 

The speed of any circuit is limited by the longest delay along the paths through the 
circuit. In the case of the circuit in Figure 3.12, the longest delay is along the path from 
the y,; input, through the XOR gate and through the carry circuit of each adder stage. The 
longest delay is often referred to as the critical-path delay, and the path that causes this 
delay is called the critical path. 
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The performance of a large digital system is dependent on the speed of circuits that form 
its various functional units. Obviously, better performance can be achieved using faster 
circuits. This can be accomplished by using superior (usually newer) technology in which 
the delays in basic gates are reduced. But it can also be accomplished by changing the overall 
structure of a functional unit, which may lead to even more impressive improvement. In 
this section we will discuss an alternative for implementation of an n-bit adder, which 
substantially reduces the time needed to add numbers. 
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3.4.1 CARRY-LOOKAHEAD ADDER 


To reduce the delay caused by the effect of carry propagation through the ripple-carry adder, 
we can attempt to evaluate quickly for each stage whether the carry-in from the previous 
stage will have a value 0 or 1. If a correct evaluation can be made in a relatively short time, 
then the performance of the complete adder will be improved. 

From Figure 3.3b the carry-out function for stage i can be realized as 


Ci-1 = Mii HF XICiT Vili 
If we factor this expression as 
Ci$1 = XiVi + OG + Yici 


then it can be written as 


Cin = Bi + DiC [3.3] 
where 
8i = Xiyi 
Di=XTYi 


The function g; is equal to 1 when both inputs x; and y; are equal to 1, regardless of the value 
of the incoming carry to this stage, c;. Since in this case stage i is guaranteed to generate 
a carry-out, g is called the generate function. The function p; is equal to 1 when at least 
one of the inputs x; and y; is equal to 1. In this case a carry-out is produced if c; = 1. The 
effect is that the carry-in of 1 is propagated through stage i; hence p; is called the propagate 
function. 

Expanding the expression 3.3 in terms of stage i — 1 gives 


Cint = Bi + PilBi-1 + Pi-1Ci-1) 
= 8 + PiSi-1 + PiPi-1Ci-1 


The same expansion for other stages, ending with stage 0, gives 


Cit = 81 + Pi8i-1 + PiPi-18i-2 + ++ + PiPi-1°+* P2P180 + PiPi-1-**Pipoco [3.4] 


This expression represents a two-level AND-OR circuit in which cj; is evaluated very 
quickly. An adder based on this expression is called a carry-lookahead adder. 

To appreciate the physical meaning of expression 3.4, it is instructive to consider its 
effect on the construction of a fast adder in comparison with the details of the ripple- 
carry adder. We will do so by examining the detailed structure of the two stages that add 
the least-significant bits, namely, stages 0 and 1. Figure 3.14 shows the first two stages 
of a ripple-carry adder in which the carry-out functions are implemented as indicated in 
expression 3.3. Each stage is essentially the circuit from Figure 3.3c except that an extra 
OR gate is used (which produces the p; signal), instead of an AND gate because we factored 
the sum-of-products expression for c;+1. 
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Figure 3.14 = Aripple-carry adder based on expression 3.3. 


The slow speed of the ripple-carry adder is caused by the long path along which a carry 
signal must propagate. In Figure 3.14 the critical path is from inputs x9 and yo to the output 
C2. It passes through five gates, as highlighted in blue. The path in other stages of an n-bit 
adder is the same as in stage 1. Therefore, the total number of gate delays along the critical 
path is 2n + 1. 

Figure 3.15 gives the first two stages of the carry-lookahead adder, using expression 
3.4 to implement the carry-out functions. Thus 


C1 = 80 + poco 
C2 = 81 + P180 + PiPoco 


This circuit does not have the long ripple-carry path that is present in Figure 3.14. Instead, 
all carry signals are produced after three gate delays: one gate delay is needed to produce 
the generate and propagate signals go, 91, po, and p;, and two more gate delays are needed 
to produce c, and c, concurrently. Extending the circuit to n bits, the final carry-out signal 
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Figure 3.15 The first two stages of a carry-lookahead adder. 


Cn would also be produced after only three gate delays because expression 3.4 is just a large 
two-level (AND-OR) circuit. 

The total delay in the n-bit carry-lookahead adder is four gate delays. The values of 
all g; and p; signals are determined after one gate delay. It takes two more gate delays to 
evaluate all carry signals. Finally, it takes one more gate delay (XOR) to generate all sum 
bits. The key to the good performance of the adder is quick evaluation of carry signals. 

The complexity of an n-bit carry-lookahead adder increases rapidly as n becomes larger. 
To reduce the complexity, we can use a hierarchical approach in designing large adders. 
Suppose that we want to design a 32-bit adder. We can divide this adder into 4 eight-bit 
blocks, such that block 0 adds bits 7... 0, block 1 adds bits 15 ... 8, block 2 adds bits 
23 ... 16, and block 3 adds bits 31 ... 24. Then we can implement each block as an 
eight-bit carry-lookahead adder. The carry-out signals from the four blocks are cg, C16, C24, 
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Figure 3.16 = A hierarchical carry-lookahead adder with ripple-carry between blocks. 


and c32. Now we have two possibilities. We can connect the four blocks as four stages in a 
ripple-carry adder. Thus while carry-lookahead is used within each block, the carries ripple 
between the blocks. This circuit is illustrated in Figure 3.16. 

Instead of using a ripple-carry approach between blocks, a faster circuit can be designed 
in which a second-level carry-lookahead is performed to produce quickly the carry signals 
between blocks. The structure of this “hierarchical carry-lookahead adder” is shown in 
Figure 3.17. Each block in the top row includes an eight-bit carry-lookahead adder, based 
on generate and propagate signals for each stage in the block, as discussed before. However, 
instead of producing a carry-out signal from the most-significant bit of the block, each block 
produces generate and propagate signals for the entire block. Let G; and P; denote these 
signals for each block j7. Now G; and P; can be used as inputs to a second-level carry- 
lookahead circuit at the bottom of Figure 3.17, which evaluates all carries between blocks. 
We can derive the block generate and propagate signals for block 0 by examining the 
expression for cg 


Cg = 87 + P7186 + P7P685 + P7P6P584 + P7P6PSP483 + P7P6PsP4P382 
+ P71P6P5P4P3P281 + P7IP6PSP4P3P2P 180 + P71P6PSP4P3P2P1P0Co 


The last term in this expression specifies that, if all eight propagate functions are 1, then 
the carry-in co is propagated through the entire block. Hence 


Po = P7P6Ps5P4P3P2P1P0 


The rest of the terms in the expression for cg represent all other cases when the block 
produces a carry-out. Thus 


Go = 87 + P7186 + P7P68s + ° + + P71P6PsP4P3P2P 180 
The expression for cg in the hierarchical adder is given by 
Cg = Go + Poco 


For block | the expressions for G; and P; have the same form as for Go and Po except that 
each subscript i is replaced by i + 8. The expressions for G2, P, G3, and P3 are derived in 
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Figure 3.17 — A hierarchical carry-lookahead adder. 


the same way. The expression for the carry-out of block 1, cj, is 


C16 = G1 + Picg 
= G, + P}Go + Pi Poco 


Similarly, the expressions for c24 and c32 are 


Cog = G2 + P2Gi + P2P1Go + P2P1 Poco 
€32 = G3 + P3Gy + P3P2G, + P3P2P1Go + P3P2P\Poco 


Using this scheme, it takes two more gate delays to produce the carry signals cg, C16, C24, 
and c3z than the time needed to generate the G; and P; functions. Therefore, since G; and P; 
require three gate delays, cg, C16, C24, and c32 are available after five gate delays. The time 
needed to add two 32-bit numbers involves these five gate delays plus two more to produce 
the internal carries in blocks 1, 2, and 3, plus one more gate delay (XOR) to generate each 
sum bit. This gives a total of eight gate delays. 

In Section 3.3.6 we determined that it takes 2n + 1 gate delays to add two numbers 
using a ripple-carry adder. For 32-bit numbers this implies 65 gate delays. It is clear that 
the carry-lookahead adder offers a large performance improvement. The trade-off is much 
greater complexity of the required circuit. 
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Technology Considerations 

The preceding delay analysis assumes that gates with any number of inputs can be 
used. But, the number of gate inputs, referred to as the fan-in of the gate, has to be limited 
in practice as we discuss in Appendix B. Therefore the reality of fan-in constraints must be 
taken into account. To illustrate this problem, consider the expressions for the first eight 
carries: 


C1 = 80 + poco 
Co = 81 + Pi80 + PiPoco 


Cs = 87 + P786 + P7P685 + P7P6P584 + P7IP6P5P483 + P7P6P5P4P382 
+ P1P6P5P4P3P281 + P7IP6PSP4P3P2P 180 + P7P6P5P4P3P2P 1P0C0 


Suppose that the maximum fan-in of the gates is four inputs. Then it is impossible to 
implement all of these expressions with a two-level AND-OR circuit. The biggest problem 
is cg, where one of the AND gates requires nine inputs; moreover, the OR gate also requires 
nine inputs. To meet the fan-in constraint, we can rewrite the expression for cg as 


Cg = (87 + P786 + P7P685 + PIP6P584) + U(P7P6PsP4) (83 + P32 + P3P281 + P3P2P 180) | 
+ (P71P6PsP4)(P3P2P 1P0)Co 


To implement this expression we need ten AND gates and three OR gates. The propagation 
delay in generating cg consists of one gate delay to develop all g; and p;, two gate delays 
to produce the sum-of-products terms in parentheses, one gate delay to form the product 
term in square brackets, and one delay for the final ORing of terms. Hence cg is valid after 
five gate delays, rather than the three gate delays that would be needed without the fan-in 
constraint. 

Because fan-in limitations reduce the speed of the carry-lookahead adder, some devices 
that are characterized by low fan-in include dedicated circuitry for implementation of fast 
adders. Examples of such devices include FPGAs, which are described in Appendix B. 


151 


3.5 DESIGN OF ARITHMETIC CIRCUITS UsING CAD TooLs 


In this section we show how the arithmetic circuits can be designed by using CAD tools. 


3.5.1 DESIGN OF ARITHMETIC CIRCUITS USING SCHEMATIC 
CAPTURE 


An obvious way to design an arithmetic circuit via schematic capture is to draw a schematic 
that contains the necessary logic gates. For example, to create an n-bit adder, we could first 
draw a schematic that represents a full-adder. Then an n-bit ripple-carry adder could be 


152 


CHAPTER 3 e NUMBER REPRESENTATION AND ARITHMETIC CIRCUITS 


created by drawing a higher-level schematic that connects together n instances of the full- 
adder. A hierarchical schematic created in this manner would look like the circuit shown in 
Figure 3.5. We could also use this methodology to create an adder/subtractor circuit, such 
as the circuit depicted in Figure 3.12. 

The main problem with this approach is that it is cumbersome, especially when the 
number of bits is large. This issue is even more apparent if we consider creating a schematic 
for a carry-lookahead adder. As shown in Section 3.4.1, the carry circuitry in each stage 
of the carry-lookahead adder becomes increasingly more complex. Hence it is necessary 
to draw a separate schematic for each stage of the adder. A better approach for creating 
arithmetic circuits via schematic capture is to use predefined subcircuits. 

The schematic capture tools provide a library of graphical symbols that represent basic 
logic gates. These gates are used to create schematics of relatively simple circuits. In 
addition to basic gates, most schematic capture tools also provide a library of commonly- 
used circuits, such as adders. Each circuit is provided as a module that can be imported 
into a schematic and used as part of a larger circuit. 

Different vendors of CAD tools have their specific approaches to how schematic capture 
is performed. Instead of dealing with the schematic capture approach, we will concentrate 
on the more convenient and flexible approach of using Verilog to design our circuits. 


3.5.2 DESIGN OF ARITHMETIC CIRCUITS USING VERILOG 


We said in Section 3.5.1 that an obvious way to create an n-bit adder is to draw a hierarchical 
schematic that contains n full-adders. This approach can also be followed by using Verilog, 
by first creating a Verilog module for a full-adder and then defining a higher-level module 
that uses n instances of the full-adder. As a first attempt at designing arithmetic circuits 
by using Verilog, we will show how to write the hierarchical code for a ripple-carry adder. 
Hierarchical Verilog code was introduced in Chapter 2. 

Suppose that we wish to implement the full-adder circuit given in Figure 3.3c, which 
has the inputs Cin, x, and y, and produces the outputs s and Cout. One way of specifying 
this circuit in Verilog is to use the gate-level primitives as shown in Figure 3.18. Each of the 
three AND gates in the circuit is defined by a separate statement. Verilog allows combining 
such statements into a single statement as shown in Figure 3.19. In this case, commas are 
used to separate the definition of each AND gate. 

Another possibility is to use functional expressions as indicated in Figure 3.20. The 
XOR operation is denoted by the” sign. Again, it is possible to combine the two continuous 
assignment statements into a single statement as shown in Figure 3.21. 

Both of the above approaches result in the same full-adder circuit being synthesized. 
We can now create a separate Verilog module for the ripple-carry adder, which instantiates 
the fulladd module as a subcircuit. One method of doing this is shown in Figure 3.22. The 
module comprises the code for a four-bit ripple-carry adder, named adder4. One of the 
four-bit numbers to be added is represented by the four signals x3, x2, x;, xo, and the other 
number is represented by y3, y2, ¥1, yo. The sum is represented by 53, 52, 51, So. The circuit 
incorporates a carry input, carryin, into the least-significant bit position and a carry output, 
carryout, from the most-significant bit position. 


3.5 DESIGN OF ARITHMETIC CIRCUITS USING CAD TooLs 


module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 
output s, Cout; 


xor (s, xX, y, Cin); 
and (z1, x, y); 

and (z2, x, Cin); 
and (z3, y, Cin); 

or (Cout, zl, z2, 23); 


endmodule 


Figure 3.18 Verilog code for the full-adder using gate-level 
primitives. 


module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 
output s, Cout; 


xor (s, xX, y, Cin); 
and (zl, x, y), 
(z2, x, Cin), 
(z3, y, Cin); 
or (Cout, zl, z2, 23); 


endmodule 


Figure 3.19 — Another version of Verilog code from 
Figure 3.18. 


The four-bit adder in Figure 3.22 is described using four instantiation statements. Each 
statement begins with the name of the module, fulladd, that is being instantiated, followed 
by an instance name. The instance names must be unique. The least-significant stage in 
the adder is named stage0 and the most-significant stage is stage3. The signal names in the 
adder4 module that are to be connected to each input and output port on the fulladd module 
are then listed. These signals are listed in the same order as in the fulladd module, namely 
the order Cin, x, y, s, Cout. 

As discussed in Section 2.10, the signal names associated with each instance of the 
fulladd module implicitly specify how the full-adders are connected together. For example, 
the carry-out of the stage0 instance is connected to the carry-in of the stage! instance. The 
synthesized circuit has the same structure as the one shown in Figure 3.5. The fulladd 
module may be included in the same Verilog source code file as the adder4 module, as we 
have done in Figure 3.22, but it may also comprise a separate file. In the latter case, the 
location of the file fulladd has to be indicated to the compiler. 
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module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 
output s, Cout; 


assign s=x% y% Cin; 


assign Cout = (x & y) | (x & Cin) | (y & Cin); 


endmodule 


Figure 3.20 _ Verilog code for the full-adder using continuous 


assignment. 


module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 
output s, Cout; 


assign s=x% y% Cin, 


Cout = (x & y)| (x & Cin) | (y & Cin); 


endmodule 


Figure 3.21 = Another version of Verilog code from Figure 3.20. 


module adder4 (carryin, x3, x2, x1, x0, y3, y2, yl, yO, s3, s2, s1, sO, carryout) 


input carryin, x3, x2, x1, x0, y3, y2, yl, yO; 
output s3, s2, s1, sO, carryout; 


fulladd stageO (carryin, x0, yO, sO, cl); 
fulladd stagel (cl, x1, yl, sl, c2); 
fulladd stage2 (c2, x2, y2, s2, c3); 
fulladd stage3 (c3, x3, y3, s3, carryout); 
endmodule 
module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 


output s, Cout; 


assign s =x “y “ Cin; 
assign Cout = (x & y) | (x & Cin) | (y & Cin); 


endmodule 


Figure 3.22 Verilog code for a four-bit adder. 
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3.5.3 USING VECTORED SIGNALS 


In Figure 3.22 each of the four-bit inputs and the four-bit output of the adder is represented 
using single-bit signals. A more convenient approach is to use multibit signals, called 
vectors, to represent the numbers. Just as a number is represented in a logic circuit as 
signals on multiple wires, it can be represented in Verilog code as a multibit vector. An 
example of an input vector is 


input [3:0] X; 


This statement defines X to be a four-bit vector. Its individual bits can be referred to by 
using an index value in square brackets. Thus, the most-significant bit (MSB) is referred 
to as X [3] and the least-significant bit (LSB) is X [0]. A two-bit vector that consists of the 
two middle bits of X is denoted as X [2:1]. The symbol X refers to the entire vector. 

Using vectors we can specify the four-bit adder as shown in Figure 3.23. In addition 
to the input vectors X and Y, and output vector S, we chose to define the carry signals 
between the full-adder stages as a three-bit vector C[3:1]. Note that the carry into stageO 
is still called carryin, while the carry from stage3 is called carryout. The internal carry 
signals are defined in the statement 

wire [3:1] C; 
In Figure 3.23, signal C[1] is used to connect the carry output of the full-adder in stage 0 
to the carry input of the full-adder in stage 1. Similarly, C[2] and C[3] are used to connect 
the other stages of the adder. 

The vector specification gives the bit width in square brackets, as in X [3:0]. The bit 
width is specified using the index of the MSB first and the LSB last. Hence, X [3] is the 
MSB and X [0] is the LSB. A reverse ordering can also be used. For example, Z[0:3] defines 
a four-bit vector in which Z[0] is its MSB and Z[3] is its LSB. The terminology MSB and 
LSB is natural when vectors are used to represent numbers. In other cases, the bit-select 
index merely identifies a particular bit in a vector. 


module adder4 (carryin, X, Y, S, carryout); 
input carryin; 
input [3:0] X, Y; 
output [3:0] S; 
output carryout; 
wire [3:1] C; 


fulladd stageO (carryin, X[0], Y[0], S[O], C[1]); 
fulladd stagel (C[1], X[1], Y(1], S{1], C[2)); 
fulladd stage2 (C[2], X[2], Y[2], S[2], C[3]); 
fulladd stage3 (C[3], X[3], Y[3], S[3], carryout); 


endmodule 


Figure 3.23 A four-bit adder using vectors. 
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3.5.4 USING A GENERIC SPECIFICATION 


The approach in designing a ripple-carry adder presented in Figure 3.23 is rather restrictive 
because the resulting circuit is of a predetermined size of four bits. A similar adder that 
could add 32-bit numbers would require Verilog code with 32 instances of the full-adder 
subcircuit defined in separate statements. From the designer’s point of view, it is preferable 
to define a module that could be used to implement an adder of any size, where the size 
may be given as a parameter. 

Verilog allows the use of general parameters that can be given a specific value as 
desired. For example, an n-bit vector representing a number may be given as X [n—1:0]. If 
n is defined in the Verilog statement 


parameter n= 4; 


then the bit range of X is [3:0]. 
The ripple-carry adder in Figure 3.5 can be described using the logic expressions 


Sk = Xe Bye BC 
Chl = XKVk + XKCK A VECK 


fork = 0, 1,...,n— 1. Instead of instantiating full-adders as in Figure 3.23, these expres- 
sions can be used in Verilog to specify the desired adder. 

Figure 3.24 shows Verilog code that defines an n-bit adder. The inputs X and Y and 
the output sum S are declared to be n-bit vectors. To simplify the use of carry signals in 
the adder circuit, we defined a vector C that has n+ 1 bits. Bit C[0] is the carry into the 
LSB position, while C[n] is the carry from the MSB position. Hence C[0] = carryin and 
carryout = C[n] in terms of the n-bit adder. 

To specify the repetitive structure of the ripple-carry adder, Figure 3.24 introduces the 
Verilog for statement. Like the if-else statement introduced in Chapter 2, the for statement 
is a procedural statement that must be placed inside an always block, as shown in the 
figure. As explained in Chapter 2, any signal that is assigned a value by a statement within 
an always block must retain this value until it is again re-evaluated by changes in the 
sensitivity variables given in the always statement. Such signals are declared to be of reg 
type; they are carryout, S, and C signals in Figure 3.24. The sensitivity variables are X, Y, 
and carryin. 

In our example, the for loop consists of two statements delineated by begin and end. 
These statements define the sum and carry functions for the adder stage that corresponds 
to the value of the loop variable k. The range of k is from 0 to n — 1 and its value is 
incremented by | for each iteration of the loop. The syntax of the Verilog for statement 
is similar to the syntax of a for loop in the C programming language. However, the C 
operators ++ and —— do not exist in Verilog, hence incrementing or decrementing of the 
loop variable must be given as k =k + 1 ork =k — 1, rather than k++ or k——. Note 
that k is declared to be an integer and it is used to control the number of iterations of the for 
loop; it does not represent a physical connection in the circuit. The effect of the for loop is 
to repeat the statements inside the loop for each loop iteration. For instance, if k were set 
to 2 in this example, then the for loop would be equivalent to the four statements 
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module addern (carryin, X, Y, S, carryout); 
parameter n = 32; 
input carryin; 
input [n— 1:0] X, Y; 
output reg [n—1:0] S; 
output reg carryout; 
reg [n:0] C; 
integer k; 


always @(X, Y, carryin) 
begin 
C[0] = carryin; 
for (k =0;k < n; k=k+1) 
begin 
S[k] = X[k] “ Y[k] * C[k]; 
C[k+1] = (X[k] & Y[k]) | X[k] & C[k]) | (YTk] & C[k]); 
end 
carryout = C[n]; 
end 


endmodule 


Figure 3.24 = A generic specification of a ripple-carry adder. 


S[0] = X[0]* Y[0] * C(O}; 
C{1] = (X[O] & Y[O]) | (X[O] & C[O]) | CY[O] & C[O}); 
Sf] = X{1)* YO] * Cf]; 
C[2] = (X{1] & Yf1)) | CU] & CH) | YT) & CLL); 


Since the value of n is 32, as declared in the parameter statement, the code in the figure 
implements a 32-bit adder. 


Using the Generate Capability 

In Figure 3.23 we instantiated four copies of the fulladd subcircuit to specify a four-bit 
ripple-carry adder. This approach can be used to specify an n-bit adder by using a loop 
that instantiates the fulladd subcircuit n times. The Verilog generate construct provides 
the desired capability. It allows instantiation statements to be included inside for loops and 
if-else statements. If a for loop is included in the generate block, the loop index variable 
has to be declared of type genvar. A genvar variable is similar to an integer variable, but 
it can have only positive values and it can be used only inside generate blocks. 

Figure 3.25 shows how the addern module can be written to instantiate n fulladd 
modules. Each instance generated by the compiler in the for loop will have a unique 
instance name produced by the compiler based on the for loop label. The generated names 
are addbit[0].stage, ... , addbit.[n — 1].stage. This code produces the same result as the 
code in Figure 3.24. 
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module addern (carryin, X, Y, S, carryout); 
parameter n = 32; 
input carryin; 
input [n-1:0] X, Y; 
output [n—1:0] S; 
output carryout; 
wire [n:0] C; 


genvar i; 
assign C[O] = carryin; 
assign carryout = C[n]; 


generate 
for GG =0;i<=n-1;i=i+1) 
begin:addbit 
fulladd stage (C[i], X[i], YO], Si], C[i+1]); 
end 
endgenerate 


endmodule 


module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 
output s, Cout; 


assign s=x “y “Cin; 
assign Cout = (x & y) | (k & Cin) | (y & Cin); 


endmodule 


Figure 3.25 = A rripple-carry adder specified by using the generate 
statement. 


3.5.5 NETS AND VARIABLES IN VERILOG 


A logic circuit is modeled in Verilog by a collection of interconnected logic elements and/or 
by procedural statements that describe its behavior. Connections between logic elements are 
defined using nets. Signals produced by procedural statements are referred to as variables. 


Nets 


A net represents a node in a circuit. Nets can be of different types. For synthesis 
purposes the only important nets are of wire type, which we used in Section 3.5.3. A wire 
connects an output of one logic element in a circuit to an input of another logic element. 


3.5 DESIGN OF ARITHMETIC CIRCUITS USING CAD TooLs 


It can be a scalar that represents a single connection or a vector that represents multiple 
connections. For example, in Figure 3.22, carry signals c3, c2, and c; are scalars that model 
the connections between the full-adder modules. The specific connections are defined by 
the way the full-adder modules are instantiated. In Figure 3.23, the same carry signals 
are defined as a three-bit vector C. Observe that in Figure 3.22 the carry signals are not 
explicitly declared to be of wire type. The reason is that nets do not have to be declared in 
the code because Verilog syntax assumes that all signals are nets by default. Of course, the 
code in the figure would also be correct if we include in it the declaration 


wire c3,c2,cl; 


In Figure 3.23 it is necessary to declare the existence of vector C; otherwise, the Verilog 
compiler would not be able to determine that C[3], C[2], and C[1] are the constituent signals 
of C. Since these signals are nets, the vector C is declared to be of wire type. 


Variables 


Verilog provides variables to allow a circuit to be described in terms of its behavior. 
A variable can be assigned a value in one Verilog statement, and it retains this value until 
it is overwritten by a subsequent assignment statement. There are two types of variables: 
reg and integer. As mentioned in Chapter 2, all signals that are assigned a value using 
procedural statements must be declared as variables by using the reg or integer keywords. 
The scalar carryout and the vectors S and C in Figure 3.24 are examples of the reg type. 
The loop variable k in the same figure illustrates the integer type. It serves as a loop index. 
Such variables are useful for describing a circuit’s behavior; they do not usually correspond 
directly to signals in the resulting circuit. 

Further discussion of nets and variables is given in Appendix A. 


3.5.6 ARITHMETIC ASSIGNMENT STATEMENTS 


Arithmetic operations are used so often that it is convenient to have them incorporated 
directly into a hardware description language. Verilog implements such operations using 
arithmetic assignment statements and vectors. If the following vectors are defined 


input [n—1:0] X, Y; 
output [n—1:0] S; 


then, the arithmetic assignment statement 
S=X+4Y; 


represents an n-bit adder. 

In addition to the + operator, which is used for addition, Verilog also provides other 
arithmetic operators. The Verilog operators are discussed fully in Chapter 4 and Appendix 
A. The complete Verilog code that includes the preceding statement is given in Figure 3.26. 
Since there is a single statement in the always block, it is not necessary to include the begin 
and end delimiters. 
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The code in Figure 3.26 defines a circuit that generates the n sum bits, but it does not 
include carry-out or overflow signals. As explained previously, the carry-out signal is useful 
when the numbers X, Y, and S are interpreted as being unsigned numbers. The carry-out is 
1 when the sum of unsigned numbers overflows n bits. But, if X, Y, and S are interpreted 
as being signed numbers, then the carry-out is not meaningful, and we have to generate the 
arithmetic overflow output as discussed in Section 3.3.5. One way in which these signals 
can be generated is given in Figure 3.27. 

The carry-out from the MSB position, n — 1, can be derived by observing that the 
carry-out must be | if both x,_; and y,_; are 1, or if either x,_; or y,—-1 is 1 and s,_, is 0. 


module addern (carryin, X, Y, S); 
parameter n = 32; 
input carryin; 
input [n—1:0] X, Y; 
output reg [n—1:0] S; 


always @(X, Y, carryin) 
S=X+4+Y+carryin; 


endmodule 


Figure 3.26 — Specification of an n-bit adder using arithmetic 
assignment. 


module addern (carryin, X, Y, S, carryout, overflow); 
parameter n = 32; 
input carryin; 
input [n- 1:0] X, Y; 
output reg [n—1:0] S; 
output reg carryout, overflow; 


always @(X, Y, carryin) 
begin 
S=X+Y+carryin; 
carryout = (X[n—1] & Y[n—1]) | (X[{n-1] &~ S[n—1]) | (Y[n-1] &~ S[n- 1)); 
overflow = (X[n- 1] & Y[n—-1] &~ S[n—1]) | (~X[n-1] & ~Y[n-1] & S[n—1]); 
end 


endmodule 


Figure 3.27) = Annz-bit adder with carry-out and overflow signals. 
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Thus, 


carryout = Xn—1Yn—1 + Xn Sn 1+ Yn 1Sp-1 


(Note that this is just a normal logic expression in which the + sign represents the OR 
operation.) The expression for arithmetic overflow was defined in Section 3.3.5 as c, ® 
Cn—-1- In our case, c, corresponds to carryout, but there is no direct way of accessing c,_1, 
which is the carry from bit-position n — 2. Instead, we can use the more intuitive expression 
derived in Section 3.3.5, so that 


overflow = Xp 1Yn 1Sn-1 + Xn Wn—15n 1 


Another way of including the carry-out and overflow signals is shown in Figure 3.28. 
The (n + 1)-bit vector named Sum is used. The extra bit, Sum[n], becomes the carry-out 
from bit-position nm — 1 in the adder. The statement used to assign the sum of X, Y, and 
carryin to the Sum signal uses an unusual syntax. The meaning of the terms in brackets, 
namely {1’b0, X} and {1’b0, Y}, is that a 0 is concatenated on the left of the n-bit vectors 
X and Y to create (n+1)-bit vectors. In Verilog the { , } operator is called the concatenate 
operator. If A is an m-bit vector and B is a k-bit vector, then {A, B} creates an (m + k)-bit 
vector comprising A as its most-significant m bits and B as its least-significant k bits. The 
notation |’b0 represents a one-bit binary number that has the value 0. The reason that the 
concatenate operator is used in Figure 3.28 is to cause Sum[n] to be equivalent to the carry 
from bit position n — 1. In effect, we created x, = y, = 0 so that 


Sum[n] = 0+ 04 cy-1 


module addern (carryin, X, Y, S, carryout, overflow); 
parameter n= 32; 
input carryin; 
input [n—1:0] X, Y; 
output reg [n—1:0] S; 
output reg carryout, overflow; 
reg [n:0] Sum; 


always @(X, Y, carryin) 


begin 
Sum = {1’b0, X} + {1’b0, Y} + carryin; 
S = Sum[n- 1:0]; 


carryout = Sum[n]; 
overflow = (X[n—1] & Y[n—1] & ~S[n—1]) | ~X[n-1] & ~Y[n- 1] & S[n— 1); 
end 


endmodule 


Figure 3.28 — An alternative specification of n-bit adder with carry-out and overflow 
signals. 
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This example is useful because it provides a simple introduction to the concept of concate- 
nation. But we could have written simply 


Sum = X + Y + carryin; 


Because Sum is an (n + 1)-bit vector, the summation will be performed as if X and Y were 
(n + 1)-bit vectors in which Os are padded on the left. 
Another detail to observe from the figure is the statement 


S = Sum[n—1:0]; 


This statement assigns the lower n bits of Sum to the output sum S. The next statement 
assigns the carry-out from the addition, Sum[n], to the output signal carryout. 

We show the code in Figures 3.27 and 3.28 to illustrate some features of Verilog in the 
context of adder design. In general, a given design task can be performed using different 
approaches, as we will see throughout the book. Let us attempt another specification of 
the n-bit adder. In Figure 3.28 we use an (n + 1)-bit vector, Sum, as an intermediate signal 
needed to produce the n bits of S and the carry-out from the adder stage n — 1. This requires 
two Verilog statements that extract the desired bits from Sum. We showed how concatenation 
can be used to pad a 0 to vectors X and Y, but pointed out that this is not necessary because 
a vector is automatically padded with Os if it is involved in an arithmetic operation that 
produces a result of greater bit size. We can use concatenation more effectively on the left 
side of the addition statement by concatenating carryout to the S vector so that 


{carryout, S} = X + Y + carryin; 


Then there is no need for the Sum signal and the Verilog code is simplified as indicated in 
Figure 3.29. Since both figures, 3.28 and 3.29, describe the same behavior of the adder, 
the Verilog compiler is likely to generate the same circuit for either figure. The code in 
Figure 3.29 is simpler and more elegant. 


module addern (carryin, X, Y, S, carryout, overflow); 
parameter n= 32; 
input carryin; 
input [n- 1:0] X, Y; 
output reg [n—1:0] S; 
output reg carryout, overflow; 


always @(X, Y, carryin) 
begin 

{carryout, S} = X + Y + carryin; 

overflow = (X[n- 1] & Y[n—-1] &~S[n—1]) | ~ X[n-1] &~ Y[n- 1] & S[n—1]); 
end 


endmodule 


Figure 3.29 — Simplified complete specification of n-bit adder. 
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module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 
output reg s, Cout; 


always @(x, y, Cin) 
{Cout, s} =x + y + Cin; 


endmodule 


Figure 3.30 — Behavioral specification of a full-adder. 


Note that the same approach can be used to specify a full-adder circuit, as shown in 
Figure 3.30. Unlike the specifications in Figures 3.18 to 3.21, which define the structure of 
the full-adder in terms of basic logic operations, in this case the code describes its behavior 
and the Verilog compiler implements the suitable details using the target technology. 


3.5.7 MoDULE HIERARCHY IN VERILOG CODE 


In Section 2.10.3 we introduced hierarchical Verilog code, in which a Verilog module instan- 
tiates other Verilog modules as subcircuits. If an instantiated module includes parameters, 
then either the default value of a parameter can be used for each instance, or a new value 
of the parameter can be specified. 

Suppose that we wish to create a circuit consisting of two adders as follows. One adder 
generates a 16-bit sum S = A + B, while the other generates an 8-bit sum T = C + D. An 
overflow signal must be set to | if either of the adders generates an arithmetic overflow. One 
way of specifying the desired circuit is presented in Figure 3.31. The top-level module, 
adder_hier, instantiates two instances of the addern module in Figure 3.29. Since the 
default value of 1 is 32 for addern, the top-level module has to define the desired values of 
16 and 8. The statement 


defparam Uln=16 
sets the value of n to 16 for the addern instance called U1. Similarly, the statement 
defparam U2.n=8 


sets the value of n to 8 for the instance named U2. 

Note that in Figure 3.31 we added an extra bit to both S and T to hold the carry-out 
signals produced by the adders; these signals are useful if the resulting sums are interpreted 
as unsigned numbers. 

Another way in which the value of n can be specified in the instantiated modules is 
shown in Figure 3.32. Here, the value of n is defined by using the Verilog # operator, instead 
of using the defparam statement. The code in Figures 3.31 and 3.32 produces identical 
results. 


163 


164 CHAPTER 3 ° NUMBER REPRESENTATION AND ARITHMETIC CIRCUITS 


module adder_hier (A, B, C, D, S, T, overflow); 
input [15:0] A, B; 
input [7:0] C, D; 
output [16:0] S; 
output [8:0] T; 
output overflow; 


wire o1, 02; // used for the overflow signals 


addern U1 (1’b0, A, B, S[15:0], S[16], 01); 
defparam U1.n = 16; 

addern U2 (1’b0, C, D, T[7:0], T[8], 02); 
defparam U2.n = 8; 


assign overflow = ol | 02; 


endmodule 


Figure 3.31 = An example of setting parameter values in Verilog 
code. 


module adder_hier (A, B, C, D, S, T, overflow); 
input [15:0] A, B; 
input [7:0] C, D; 
output [16:0] S; 
output [8:0] T; 
output overflow; 


wire o1, 02; // used for the overflow signals 


addern #(16) U1 (1’b0, A, B, S[15:0], S[16], 01): 
addern #(8) U2 (1’b0, C, D, T[7:0], T[8], 02): 


assign overflow = ol | 02; 


endmodule 


Figure 3.32 — Using the Verilog # operator to set the values of 
parameters. 


An alternative version of the code in Figure 3.32 is given in Figure 3.33. Here, we 
have again used the # operator, but in this case we have explicitly included the name n 
by using the syntax #(.n(16)) and #(.n(8)). In the same way that the names of parameters 
can be shown explicitly, the code in Figure 3.33 illustrates how the port names ofa subcircuit 
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module adder_hier (A, B, C, D, S, T, overflow); 
input [15:0] A, B; 
input [7:0] C, D; 
output [16:0] S; 
output [8:0] T; 
output overflow; 


wire 01, 02; // used for the overflow signals 


addern #(.n(16)) U1 
( 
.carryin (1’b0), 
.X (A), 
-Y (B), 
»S (S[15:0)), 
.carryout (S[16]), 
overflow (01) 
); 
addern #(.n(8)) U2 
( 
.carryin (1’b0), 
XX (C), 
-Y (D), 
S (T[7:0]), 
.carryout (T[8]), 
overflow (02) 


); 
assign overflow = ol | 02; 


endmodule 


Figure 3.33 = An alternative version of the code in Figure 3.32. 


can be included explicitly. In Verilog jargon this style of code is referred to as named 
port connections. In this case, the signals can be listed in any order. If the names of port 
signals of the instantiated module are not shown explicitly, as in Figures 3.31 and 3.32, 
then the order in which signals are listed in an instantiation statement determines how the 
connections are made. This is referred to as ordered association of port connections. In 
this case, the instantiation statement must list the ports in the order in which they appear 
in the instantiated module. The drawback of the style of code in Figure 3.33 is that it is 
more verbose. But, it has the advantage of being more explicit and hence less susceptible 
to careless errors. 
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3.5.8 REPRESENTATION OF NUMBERS IN VERILOG CODE 


Numbers can be given as constants in Verilog code. They can be given as binary (b), octal 
(0), hexadecimal (h), or decimal (d) numbers. Their size can be either fixed or unspecified. 
For sized numbers the format is 


<size_in_bits>’<radix_identifier> <significant_digits> 


The size is a decimal number that gives the number of bits needed, the radix is identified 
using letters b, 0, h, or d, and the digits are given in the notation of the radix used. For 
example, the decimal number 2217 can be represented using 12 bits as follows 


12’b100010101001 
12704251 

12°h8A9 

12°d2217 


Unsized numbers are given without specifying the size. For example, the decimal number 
278 may be given as 


*p100010110 
0426 

*h116 

278 


For decimal numbers it is not necessary to give the radix identifier d. When an unsized 
number is used in an expression the Verilog compiler gives it a certain size, which is typically 
the same as the size of the other operand(s) in the expression. To improve readability, it is 
possible to use the underscore character. Instead of writing 12’b100010101001, it is easier 
to visualize the same number as 12’b1000_1010_1001. 

Negative numbers are represented by placing the minus sign in front. Thus, if —5 is 
specified as —4’b101, it will be interpreted as a four-bit 2’s-complement of 5, which is 
1011. 

The specified size may exceed the number of bits that are actually needed to represent a 
given number. In this case, the final representation is padded to the left to yield the required 
size. However, if there are more digits than can fit into the number of bits given as the size, 
the extra digits will be ignored. 

Numbers represented by vectors of different bit sizes can be used in arithmetic opera- 
tions. Suppose that A is an eight-bit vector and B is a four-bit vector. Then the statement 


S=A+B; 


will generate an eight-bit sum vector S. The result will be correct if B is a positive number. 
However, if B is a negative number expressed in 2’s complement representation, the result 
will be incorrect because Os will be padded on the left to make B an eight-bit vector for the 
purpose of the addition operation. The value of a positive number does not change if Os are 
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appended as the most-significant bits; the value of a negative number does not change if 
1s are appended as the most-significant bits. Such replication of the sign bit is called sign 
extension. Therefore, for correct operation it is necessary to use a sign-extended version of 
B, which can be accomplished with concatenation in the statement 


S=A-+ {4{B[3]}, B}; 


The notation 4{B[3]} denotes that the bit B[3] is replicated four times; it is equivalent to 
writing {B[3], B[3], B[3], B[3]}. This is referred to as the replication operator, which is 
discussed in Chapter 4. 
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Before we discuss the general issue of multiplication, we should note that a binary number, 
B, can be multiplied by 2 simply by adding a zero to the right of its least-significant bit. This 
effectively moves all bits of B to the left, and we say that B is shifted left by one bit position. 
Thus if B = b,_1b,-2---+ bj bo, then 2 x B = by_{by_-2--+b, bo0. (We have already used 
this fact in Section 3.2.3.) Similarly, a number is multiplied by 2* by shifting it left by k bit 
positions. This is true for both unsigned and signed numbers. 

We should also consider what happens if a binary number is shifted right by k bit 
positions. According to the positional number representation, this action divides the 
number by 2‘. For unsigned numbers the shifting amounts to adding k zeros to the 
left of the most-significant bit. For example, if B is an unsigned number, then B + 2 = 
Obn»-1bn-2-++b2b;. Note that bit bo is lost when shifting to the right. For signed num- 
bers it is necessary to preserve the sign. This is done by shifting the bits to the right 
and filling from the left with the value of the sign bit. Hence if B is a signed number, 
then B + 2 = by bn_by_2--+b2b,. For instance, if B = 011000 = (24)j0, then B + 2 = 
001100 = (12)19 and B + 4 = 000110 = (6)j0. Similarly, if B = 101000 = —(24)jo, then 
B~+2= 110100 = —(12),9 and B + 4 = 111010 = —(6),9. The reader should also ob- 
serve that the smaller the positive number, the more Os there are to the left of the first 1, 
while for a negative number there are more Is to the left of the first 0. 

Now we can turn our attention to the general task of multiplication. Two binary numbers 
can be multiplied using the same method as we use for decimal numbers. We will focus our 
discussion on multiplication of unsigned numbers. Figure 3.34a shows how multiplication 
is performed manually, using four-bit numbers. Each multiplier bit is examined from right 
to left. If a bit is equal to 1, an appropriately shifted version of the multiplicand is added 
to form a partial product. If the multiplier bit is equal to 0, then nothing is added. The 
sum of all shifted versions of the multiplicand is the desired product. Note that the product 
occupies eight bits. 


3.6.1 ARRAY MULTIPLIER FOR UNSIGNED NUMBERS 
Figure 3.34b indicates how multiplication may be performed by using multiple adders. In 


each step a four-bit adder is used to compute the new partial product. Note that as the 
computation progresses, the least-significant bits are not affected by subsequent additions; 
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Multiplicand M_~ (14) 1110 


Multiplier Q (11) xX 1011 


1110 
1110 
0000 
1110 


Product P (154) 10011010 


(a) Multiplication by hand 


Partial product 0 


Partial product 1 


Partial product 2 


Multiplicand M (14) 1110 
Multiplier Q (11) x 1011 
Partial product 0 1 8 
+1110 
Partial product 1 10101 
+ 0000 
Partial product 2 01010 
+1110] 
Product P (154) 10011010 


(b) Using multiple adders 


93 q2 q qo 


Msdq Mago. Mido Modo 
Md, M14) Mod 


PP1, PP1, PP1, PP1, PPI, 
+34. Md. M42 Mod 


PP2, PP2, PP2, PP2, PP2, 


+ M393 M243 143 M043 | 


Product P P7 Po 


Figure 3.34 — Multiplication of unsigned numbers. 


P3 P2 Py Po 


(c) Hardware implementation 


hence they can be passed directly to the final product, as indicated by blue arrows. Of 
course, these bits are a part of the partial products as well. 

The same scheme can be used to design a multiplier circuit. We will stay with four- 
bit numbers to keep the discussion simple. Let the multiplicand, multiplier, and product 
be denoted as M = m3m2m mo, Q = q3q24190, and P = p7p6PsP4P3P2P Po. Tespectively. 
Figure 3.34c shows the required operations. Partial product 0 is obtained by using the AND 
of go with each bit of M, which produces 0 if go = 0 and M if qo = 1. Thus 


PPO = m3qQo M2Go ™1Go Mogo 


Partial product 1, PP 1, is generated by adding PPO to a shifted version of M that is ANDed 
with g;. Similarly, partial product PP2 is generated using the AND of qg2 with M shifted 
by another bit position and adding to PP1, and so on. 


3.6 MULTIPLICATION 


P7 Po Ps P4 P3 P2 Py Po 


Figure 3.35 =A 4x 4 multiplier circuit. 


A circuit that implements the preceding operations is arranged in an array, as shown in 
Figure 3.35. In this figure the AND gates and full-adders that produce the partial products 
are shown in blue, in the same way as the blue highlighted rows in Figure 3.34c. The 
full-adders are connected to form ripple-carry adders. It is possible to design even faster 
multipliers by using other types of adders [1]. 


3.6.2 MULTIPLICATION OF SIGNED NUMBERS 


Multiplication of unsigned numbers illustrates the main issues involved in the design of 
multiplier circuits. Multiplication of signed numbers is somewhat more complex. 

If the multiplier operand is positive, it is possible to use essentially the same scheme as 
for unsigned numbers. For each bit of the multiplier operand that is equal to 1, a properly 
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shifted version of the multiplicand must be added to the partial product. The multiplicand 
can be either positive or negative. 

Since shifted versions of the multiplicand are added to the partial products, it is 
important to ensure that the numbers involved are represented correctly. For example, 
if the two right-most bits of the multiplier are both equal to 1, then the first addition 
must produce the partial product PP1 = M + 2M, where M is the multiplicand. If M = 
My—{My_2+++m\mMo, then PP1 = my_My_2 +++ MyM + My_{My_2+++m,m_0. The adder 
that performs this addition comprises circuitry that adds two operands of equal length. 
Since shifting the multiplicand to the left, to generate 2M, results in one of the operands 
having n+ 1 bits, the required addition has to be performed using the second operand, 
M, represented also as an (n+ 1)-bit number. An n-bit signed number is represented 
as an (n+ 1)-bit number by using sign extension, that is, by replicating the sign bit as 
the new left-most bit. Thus M = m,_,mpy_2--- mmo is represented using (n + 1) bits as 
M = my—|My—1Mp—2 +++ MyM. 

When a shifted version of the multiplicand is added to a partial product, overflow has 
to be avoided. Hence the new partial product must be larger by one extra bit. Figure 
3.36a illustrates the process of multiplying two positive numbers. The sign-extended bits 
are shown in blue. Part (b) of the figure involves a negative multiplicand. Note that the 
resulting product has 2n bits in both cases. 

For a negative multiplier operand, it is possible to convert both the multiplier and the 
multiplicand into their 2’s complements because this will not change the value of the result. 
Then the scheme for a positive multiplier can be used. 

We have presented a relatively simple scheme for multiplication of signed numbers. 
There exist other techniques that are more efficient but also more complex. We will not 
pursue these techniques, but an interested reader may consult reference [1]. 

We have discussed circuits that perform addition, subtraction, and multiplication. An- 
other arithmetic operation that is needed in computer systems is division. Circuits that 
perform division are more complex; we will present an example in Chapter 7. Various 
techniques for performing division are usually discussed in books on the subject of com- 
puter organization and can be found in references [1, 2]. 


3.7 OTHER NUMBER REPRESENTATIONS 


In the previous sections we dealt with binary integers represented in the positional number 
representation. Other types of numbers are also used in digital systems. In this section we 
will discuss briefly three other types: fixed-point, floating-point, and binary-coded decimal 
numbers. 


3.7.1 FIXED-POINT NUMBERS 


A fixed-point number consists of integer and fraction parts. It can be written in the positional 
number representation as 


B= by by_2- ++ by bo.b_\b_2 +++ b_x 
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Multiplicand M (+14) 01110 
Multiplier Q (+11) x 01011 
Partial product 0 0001110 
+ 001110 
Partial product 1 0010101 
+ 000000 
Partial product 2 0001010 
+ 001110 
Partial product 3 0010011 
+ 000000] 
Product P (+154) OO0O1LO0011010 


(a) Positive multiplicand 


Multiplicand M C14) 10010 
Multiplier Q (+11) x 01011 
Partial product 0 1110010 
+ 110010 
Partial product 1 1101011 
+ 000000 
Partial product 2 1110101 
+ 110010 
Partial product 3 1101100 
+ 000000] 
Product P (-154) 1101100110 


(b) Negative multiplicand 


Figure 3.36 —= Multiplication of signed numbers. 


The value of the number is 


n—-1 
V(B) = D0 x 2' 


i=—k 


The position of the radix point is assumed to be fixed; hence the name fixed-point number. 
If the radix point is not shown, then it is assumed to be to the right of the least-significant 
digit, which means that the number is an integer. 

Logic circuits that deal with fixed-point numbers are essentially the same as those used 
for integers. We will not discuss them separately. 
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3.7.2 FLOATING-POINT NUMBERS 


Fixed-point numbers have a range that is limited by the significant digits used to represent 
the number. For example, if we use eight digits and a sign to represent decimal integers, 
then the range of values that can be represented is 0 to +99999999. If eight digits are 
used to represent a fraction, then the representable range is 0.00000001 to +0.99999999. 
In scientific applications it is often necessary to deal with numbers that are very large or 
very small. Instead of using the fixed-point representation, which would require many 
significant digits, it is better to use the floating-point representation in which numbers are 
represented by a mantissa comprising the significant digits and an exponent of the radix R. 
The format is 


Mantissa x RE Ponent 
The numbers are often normalized, such that the radix point is placed to the right of the first 
nonzero digit, as in 5.234 x 10*3 or 6.31 x 10728. 

Binary floating-point representation has been standardized by the Institute of Electrical 
and Electronic Engineers (IEEE) [3]. Two sizes of formats are specified in this standard— 
a single-precision 32-bit format and a double-precision 64-bit format. Both formats are 
illustrated in Figure 3.37. 


Single-Precision Floating-Point Format 


Figure 3.37a depicts the single-precision format. The left-most bit is the sign bit—0 
for positive and 1 for negative numbers. There is an 8-bit exponent field, , and a 23-bit 
mantissa field, M. The exponent is with respect to the radix 2. Because it is necessary to 


32 bits; — AMA rr 


Si 
0 aries + 8-bit 23 bits of mantissa 
1 denotes — excess-127 

exponent 


(a) Single precision 


64 bits AAA ar 


11-bit excess-1023 52 bits of mantissa 
exponent 


Sign 


(b) Double precision 


Figure 3.37 —_|EEE Standard floating-point formats. 
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be able to represent both very large and very small numbers, the exponent can be either 
positive or negative. Instead of simply using an 8-bit signed number as the exponent, which 
would allow exponent values in the range —128 to 127, the IEEE standard specifies the 
exponent in the excess-127 format. In this format the value 127 is added to the value of the 
actual exponent so that 


Exponent = E — 127 


In this way E becomes a positive integer. This format is convenient for adding and subtract- 
ing floating-point numbers because the first step in these operations involves comparing the 
exponents to determine whether the mantissas must be appropriately shifted to add/subtract 
the significant bits. The range of E is 0 to 255. The extreme values of EF = 0 and FE = 255 
are taken to denote the exact zero and infinity, respectively. Therefore, the normal range of 
the exponent is —126 to 127, which is represented by the values of E' from | to 254. 

The mantissa is represented using 23 bits. The IEEE standard calls for a normalized 
mantissa, which means that the most-significant bit is always equal to 1. Thus it is not 
necessary to include this bit explicitly in the mantissa field. Therefore, if M is the bit vector 
in the mantissa field, the actual value of the mantissa is 1.M, which gives a 24-bit mantissa. 
Consequently, the floating-point format in Figure 3.37a represents the number 


Value = +1.M x 22-127 


The size of the mantissa field allows the representation of numbers that have the precision 
of about seven decimal digits. The exponent field range of 2~!*° to 2!77 corresponds to 
about 10*°8, 


Double-Precision Floating-Point Format 

Figure 3.37b shows the double-precision format, which uses 64 bits. Both the exponent 
and mantissa fields are larger. This format allows greater range and precision of numbers. 
The exponent field has 11 bits, and it specifies the exponent in the excess-1023 format, 
where 


Exponent = E — 1023 


The range of F is 0 to 2047, but again the values E = 0 and E = 2047 are used to indicate 
the exact zero and infinity, respectively. Thus the normal range of the exponent is — 1022 
to 1023, which is represented by the values of E from | to 2046. 

The mantissa field has 52 bits. Since the mantissa is assumed to be normalized, its 
actual value is again 1.M. Therefore, the value of a floating-point number is 


Value = +1.M x 2£~1023 


This format allows representation of numbers that have the precision of about 16 decimal 
digits and the range of approximately 10*°8. 

Arithmetic operations using floating-point operands are significantly more complex 
than signed integer operations. Because this is a rather specialized domain, we will not 
elaborate on the design of logic circuits that can perform such operations. For a more 
complete discussion of floating-point operations, the reader may consult references [1, 2]. 
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3.7.3 BINARY-CODED-DECIMAL REPRESENTATION 


In digital systems it is possible to represent decimal numbers simply by encoding each digit 
in binary form. This is called the binary-coded-decimal (BCD) representation. Because 
there are 10 digits to encode, it is necessary to use four bits per digit. Each digit is encoded 
by the binary pattern that represents its unsigned value, as shown in Table 3.3. Note that 
only 10 of the 16 available patterns are used in BCD, which means that the remaining 6 
patterns should not occur in logic circuits that operate on BCD operands; these patterns 
are usually treated as don’t-care conditions in the design process. BCD representation was 
used in some early computers as well as in many handheld calculators. Its main virtue is 
that it provides a format that is convenient when numerical information is to be displayed 
on a simple digit-oriented display. Its drawbacks are complexity of circuits that perform 
arithmetic operations and the fact that six of the possible code patterns are wasted. 

Even though the importance of BCD representation has diminished, it is still encoun- 
tered. To give the reader an indication of the complexity of the required circuits, we will 
consider BCD addition in some detail. 


BCD Addition 


The addition of two BCD digits is complicated by the fact that the sum may exceed 
9, in which case a correction will have to be made. Let X = x3x2x\xo and Y = y3y2y1 yo 
represent the two BCD digits and let S = 53525;59 be the desired sum digit, S = X + Y. 
Obviously, if X + Y < 9, then the addition is the same as the addition of 2 four-bit unsigned 
binary numbers. But, if X + Y > 9, then the result requires two BCD digits. Moreover, 
the four-bit sum obtained from the four-bit adder may be incorrect. 

There are two cases where some correction has to be made: when the sum is greater 
than 9 but no carry-out is generated using four bits, and when the sum is greater than 15 so 


Table 3.3. Binary-coded 
decimal digits. 


Decimal digit BCD code 
0 0000 
0001 
0010 
0011 
0100 
0101 
0110 
O11 
1000 
1001 


oO mA HN FWY = 
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xX O1lll 7 
+ Y +0101 +5 
Zi 1100 12 
+0110 

carry —e 10010 
eee Y gol 
S=2 
xX 1000 8 
+ Y +1001 +9 
Z: 10001 17 
+0110 
carry—e 10111 
=v 
SS 7 


Figure 3.38 — Addition of BCD digits. 


that a carry-out is generated using four bits. Figure 3.38 illustrates these cases. In the first 
case the four-bit addition yields Z = 7 + 5 = 12. To obtain a correct BCD result, we must 
generate S = 2 and a carry-out of 1. The necessary correction is apparent from the fact 
that the four-bit addition is a modulo-16 scheme, whereas decimal addition is a modulo-10 
scheme. Therefore, a correct decimal digit can be generated by adding 6 to the result of 
four-bit addition whenever this result exceeds 9. Thus we can arrange the computation as 
follows 


Z=X+Y 
If Z < 9, then S = Z and carry-out = 0 
ifZ > 9, then S = Z + 6 and carry-out = 1 


The second example in Figure 3.38 shows what happens when X + Y > 15. In this case 
Z = 17, which is a five-bit binary number. Note that the four least-significant bits, Z3_0, 
represent the value 1, while Z, represents the value 16. To generate a correct BCD result, 
it is again necessary to add 6 to the intermediate sum Z3_9 to produce the value 7. The bit 
Zz is the carry to the next digit position. 

Figure 3.39 gives a block diagram of a one-digit BCD adder that is based on this 
scheme. The block that detects whether Z > 9 produces an output signal, Adjust, which 
controls the multiplexer that provides the correction when needed. A second four-bit adder 
generates the corrected sum bits. Since whenever the adjustment of 6 is made there is a 
carry to the next digit position, Co; 1s just equal to the Adjust signal. 

The one-digit BCD adder can be specified in Verilog code by describing its behavior 
as shown in Figure 3.40. Inputs X and Y, and output S are defined as four-bit numbers. 
The intermediate sum, Z, is defined as a five-bit number. The if-else statement is used to 
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carry-out 


4-bit adder 


Cout 


Figure 3.39 Block diagram for a one-digit BCD adder. 


module bcdadd (Cin, X, Y, S, Cout); 
input Cin; 
input [3:0] X, Y; 
output reg [3:0] S; 
output reg Cout; 


reg [4:0] Z; 
always @(X, Y, Cin) 
begin 
Z=X+Y+Cin; 
if (Z < 10) 
{Cout, S} = Z; 
else 
{Cout, S} = Z + 6; 
end 
endmodule 


Figure 3.40 _ Verilog code for a one-digit BCD adder. 
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provide the adjustment explained above; hence it is not necessary to use an explicit Adjust 
signal. 

If we wish to derive a circuit to implement the block diagram in Figure 3.39 by hand, 
instead of by using Verilog, then the following approach can be used. To define the Adjust 
function, we can observe that the intermediate sum will exceed 9 if the carry-out from the 
four-bit adder is equal to 1, or if z3 = | and either z2 or z, (or both) are equal to 1. Hence 
the logic expression for this function is 


Adjust = Carry-out + z3(z2 + z1) 


Instead of implementing another complete four-bit adder to perform the correction, we can 
use a simpler circuit because the addition of constant 6 does not require the full capability 
of a four-bit adder. Note that the least-significant bit of the sum, so, is not affected at all; 
hence sp = Z. A two-bit adder may be used to develop bits sz and s. Bit 53 is the same as 
z3 if the carry-out from the two-bit adder is 0, and it is equal to Z3 if this carry-out is equal 
to 1. A complete circuit that implements this scheme is shown in Figure 3.41. Using the 


Xz XX Xp ¥3 Yo Vy Yo 


edd Edy 


Joh iT 
Tat 


Two-bit adder 


Cout 53 So Sy So 


Figure 3.41 — Circuit for a one-digit BCD adder. 
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one-digit BCD adder as a basic block, it is possible to build larger BCD adders in the same 
way as a binary full-adder is used to build larger ripple-carry binary adders. 

Subtraction of BCD numbers can be handled with the radix-complement approach. Just 
as we use 2’s complement representation to deal with negative binary numbers, we can use 
10’s complement representation to deal with decimal numbers. We leave the development 
of such a scheme as an exercise for the reader (see Problem 3.19). 


3.8 EXAMPLES OF SOLVED PROBLEMS 


This section presents some typical problems that the reader may encounter and shows how 
such problems can be solved. 


Example 3.6 Problem: Convert the decimal number 14959 into a hexadecimal number. 


Solution: An integer is converted into the hexadecimal representation by successive divi- 
sions by 16, such that in each step the remainder is a hex digit. To see why this is true, 
consider a four-digit number H = h3hzh;ho. Its value is 


V=h,x 16+ x 167 +h, x 16+ ho 


If we divide this by 16, we obtain 
V 7) ho 
—=)} 16° +h 16+h — 
16 3X + hy xX 16+, + 16 


Thus, the remainder gives ho. Figure 3.42 shows the steps needed to perform the conversion 
(14959) 19 = BAGF) 16. 


Convert (14959) 19 
Remainder Hex digit 
14959 + 16 = 934 15 F LSB 
934+16 = 58 6 6 
58+16 = 3 10 A 
3+16 = O 3 3 MSB 


Result is (3A6F) 16 


Figure 3.42 — Conversion from decimal to hexadecimal. 


3.8 EXAMPLES OF SOLVED PROBLEMS 


179 


Problem: Convert the decimal fraction 0.8254 into binary representation. 


Solution: As indicated in Section 3.7.1, a binary fraction is represented as the bit pattern 
B=0.b_,\b_2---b_,» and its value is 


Vad) x2! 4+b4x274-+-+ din x 2 
Multiplying this expression by 2 gives 
by + bg X21 400+ + din x 2D 


Here, the left-most term is the first bit to the right of the radix point. The remaining terms 
constitute another binary fraction which can be manipulated in the same way. Therefore, 
to convert a decimal fraction into a binary fraction, we multiply the decimal number by 
2 and set the computed bit to 0 if the product is less than | and set it to 1 if the product 
is greater than or equal to 1. We repeat this calculation until a sufficient number of bits 
are obtained to meet the desired accuracy. Note that it may not be possible to represent a 
decimal fraction with a binary fraction that has exactly the same value. Figure 3.43 shows 
the required computation that yields (0.8254) 19 = (O.11010011...)2. 


Example 3.7 


Convert (0.8254), 


0.8254 x2 = 1.6508 L-_» 1 MSB 
0.6508 x 2 = 13016 L-—» 1 
0.3016 x2 = 0.6032 L-—__» 0 
0.6032 x2 = 12064: 1 
0.2064 x2 = 04g L + 0 
0.4128 x2 = 08256 L——-+ 0 
0.8256 x2 = tesi2 LL» 1 
0.6512 x2 = 13024 LL» 1 LSB 


(0.8254) 9 = (0.11010011...), 


Figure 3.43 = Conversion of fractions from decimal to binary. 
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Example 3.8 


Problem: Convert the decimal fixed point number 214.45 into a binary fixed point number. 


Solution: For the integer part perform successive division by 2 as illustrated in Fig- 
ure 1.6. For the fractional part perform successive multiplication by 2 as described in Ex- 
ample 3.7. The complete computation is presented in Figure 3.44, producing (214.45) i9 = 
(11010110.0111001...)o. 


Example 3.9 


Problem: In computer computations it is often necessary to compare numbers. Two four-bit 
signed numbers, X = x3x2x;x9 and Y = y3y2y; yo, can be compared by using the subtractor 
circuit in Figure 3.45, which performs the operation X — Y. The three outputs denote the 
following: 


e Z = 1 if the result is 0; otherwise Z = 0 
® N = 1 if the result is negative; otherwise N = 0 
® V = 1 if arithmetic overflow occurs; otherwise V = 0 


Show how Z, N, and V can be used to determine the cases X = Y,X <Y,X <Y,X>Y, 
and X > Y. 


Solution: Consider first the case X < Y, where the following possibilities may arise: 


e If X and Y have the same sign there will be no overflow, hence V = 0. Then for both 
positive and negative X and Y the difference will be negative (NV = 1). 


® If X is negative and Y is positive, the difference will be negative (NV = 1) if there is 
no overflow (V = 0); but the result will be positive (V = 0) if there is overflow (V = 1). 


Therefore, if X < Y thnN @V = 1. 

The case X = Y is detected by Z = 1. Then, X < Y is detected by Z+ (VN @V) = 1. 
The last two cases are just simple inverses: X > Y if Z+(N@V)=1 and X => Y if 
NOV=l1. 


Example 3.10 


Problem: Write Verilog code to specify the circuit in Figure 3.45. 


Solution: We can specify the circuit using the approach given in Figure 3.23, as indicated 
in Figure 3.46. Note that the statement 


assign Z = !S; 
will produce | only if S = 53525159 = 0000. Thus, it represents the four-bit NOR function 
Z=83+524+ 81 + So 


Instantiating each full adder separately becomes awkward when large circuits are in- 
volved, as would be the case if the comparator had 32-bit operands. An alternative is to use 
the generic specification presented in Figure 3.24, as shown in Figure 3.47. 
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Convert (214.45), 


3 
2 


L 
2 
0.45 x 2 
0.90 x 2 
0.80 x 2 
0.60 x 2 
0.20 x 2 
0.40 x 2 


0.80 x 2 


r 


107 + = 0 LSB 


nn 
ies) 
+ 


N 

n 

+ 
~ | ~] 
— 


13+-= 0 


r 


ion 
+ 


e 
+ 


reer 


So 
+ 


io boy 1 LSB 


(214.45), = (11010110.0111001...)> 


Figure 3.44 — Conversion of fixed point numbers from decimal to 


binary. 
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Vv N Z 


(overflow) (negative) (zero) 


Figure 3.45 = A comparator circuit. 


module comparator (X, Y, V, N, Z); 
input [3:0] X, Y; 


output V,N, Z; 

wire [3:0] S; 

wire [4:1] C; 

fulladd stageO (1’b1, X[0], ~ Y[O], S[0], C[1]); 
fulladd stagel (C[1], X[1],~Y[1], S[1], C[2)); 
fulladd stage2 (C[2], X[2],~Y[2], S[2], C[3]); 
fulladd stage3 (C[3], X[3], ~ Y[3], S[3], C[4])); 


assign V =C[4] * C[3]; 
assign N = S[3]; 
assign Z = !S; 


endmodule 


module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 
output s, Cout; 


assign s=x “y * Cin; 
assign Cout = (x & y) | (k & Cin) | (y & Cin); 


endmodule 


Figure 3.46 — Structural Verilog code for the comparator circuit. 
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module comparator (X, Y, V, N, Z); 
parameter n = 32; 
input [n—1:0] X, Y; 
output reg V,N, Z; 
reg [n—1:0] S; 
reg [n:0] C; 
integer k; 


always @(X, Y) 


begin 
C[O] = 1’b1; 
for (k =0;k < n; k=k+1) 
begin 


S[k] = X[k] “~ Y[k] “ C[k]; 
C[k+1] = (X[k] & ~YT[k]) | CX{k] & C[k]) | (~Y[k] & C[k]); 


end 
V =C[n] * C[n-1]; 
N=S[n-1]; 
Z=!S; 
end 
endmodule 


Figure 3.47 — Generic Verilog code for the comparator circuit. 


Problem: Figure 3.35 depicts a four-bit multiplier circuit. Each row in this circuit consists Example 3.11 
of four full-adder (FA) blocks connected in a ripple-carry configuration. The delay caused 
by the carry signals rippling through the rows has a significant impact on the time needed 
to generate the output product. In an attempt to speed up the circuit, we may use the 
arrangement in Figure 3.48. Here, the carries in a given row are “saved” and included in 
the next row at the correct bit position. Then, in the first row the full-adders can be used 
to add three properly shifted bits of the multiplicand as selected by the multiplier bits. For 
example, in bit position 2 the three inputs are m2q0, m1q1, and moqz. In the last row it is 
still necessary to use the ripple-carry adder. A circuit that consists of an array of full-adders 
connected in this manner is called a carry-save adder array. 

What is the total delay of the circuit in Figure 3.48 compared to that of the circuit in 
Figure 3.35? 


Solution: In the circuit in Figure 3.35 the longest path is through the right-most two full- 
adders in the top row, followed by the two right-most FAs in the second row, and then 
through all four FAs in the bottom row. Hence this delay is eight times the delay through a 
full-adder block. In addition, there is the AND-gate delay needed to form the inputs to the 
first FA in the top row. These combined delays are the critical delay, which determines the 
speed of the multiplier circuit. 
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™340 M40 ™140 ™o40 


P7 Po P5 P4 P3 P2 Pi Po 


Figure 3.48 — Multiplier carry-save array. 


In the circuit in Figure 3.48, the longest path is through the right-most FAs in the first 
and second rows, followed by all four FAs in the bottom row. Therefore, the critical delay 
is six times the delay through a full-adder block plus the AND-gate delay needed to form 
the inputs to the first FA in the top row. 


*3.1 


*3.2 


*3.3 


| PROBLEMS 


Answers to problems marked by an asterisk are given at the back of the book. 


Determine the decimal values of the following unsigned numbers: 
(a) (0111011110) 

(b) (1011100111) 

(c) 3751s 

(d) (A25F) 16 

(e) (FOFO) 16 


Determine the decimal values of the following 1’s complement numbers: 
(a) 0111011110 
(b) 1011100111 
(c) 1111111110 


Determine the decimal values of the following 2’s complement numbers: 
(a) 0111011110 
(b) 1011100111 
(c) 1111111110 


*3.4 
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Convert the decimal numbers 73, 1906, —95, and — 1630 into signed 12-bit numbers in the 
following representations: 

(a) Sign and magnitude 

(b) 1’s complement 

(c) 2’s complement 

Perform the following operations involving eight-bit 2’s complement numbers and indicate 
whether arithmetic overflow occurs. Check your answers by converting to decimal sign- 
and-magnitude representation. 


00110110 01110101 11011111 
+ 01000101 + 11011110 + 10111000 


00110110 01110101 11010011 
— 00101011 — 11010110 — 11101100 


Prove that the XOR operation is associative, which means that x; ® (y; ® z;) = (4 ® y;) ® 
ie 

Show that the circuit in Figure 3.4 implements the full-adder specified in Figure 3.3a. 
Prove the validity of the simple rule for finding the 2’s complement of a number, which 
was presented in Section 3.3. Recall that the rule states that scanning a number from right 
to left, all Os and the first 1 are copied; then all remaining bits are complemented. 

Prove the validity of the expression Overflow = c, ® cn,-1 for addition of n-bit signed 
numbers. 


Verify that a carry-out signal, cx, from bit position k — | of an adder circuit can be generated 
aS Ce = X, BY_ OB Sx, where x, and y; are inputs and s,; is the sum bit. 

Consider the circuit in Figure P3.1. Can this circuit be used as one stage in a ripple-carry 
adder? Discuss the pros and cons. The operation of transistors shown in the figure is 
described in Appendix B. 

Determine the number of gates needed to implement an n-bit carry-lookahead adder, as- 
suming no fan-in constraints. Use AND, OR, and XOR gates with any number of inputs. 
Determine the number of gates needed to implement an eight-bit carry-lookahead adder 
assuming that the maximum fan-in for the gates is four. 

In Figure 3.17 we presented the structure of a hierarchical carry-lookahead adder. Show 
the complete circuit for a four-bit version of this adder, built using 2 two-bit blocks. 

What is the critical delay path in the multiplier in Figure 3.35? What is the delay along this 
path in terms of the number of gates? 


Write a Verilog module to describe the 4 x 4 multiplier shown in Figure 3.35. Synthesize 
a circuit from the code and verify its functional correctness. 


Consider the Verilog code in Figure P3.2. Given the relationship between the signals IN and 
OUT, what is the functionality of the circuit described by the code? Comment on whether 
or not this code represents a good style to use for the functionality that it represents. 


Design a circuit that generates the 9’s complement of a BCD digit. Note that the 9’s 
complement of d is 9 — d. 
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c 


Figure P3.1 Circuit for Problem 3.11. 


module problem3_17 (IN, OUT); 
input [3:0] IN; 
output reg [3:0] OUT; 


always @(IN) 

if (IN == 4’b0101) OUT = 4’b0001; 
else if (IN == 4’b0110) OUT =4’b0010; 
else if (IN == 4’b0111) OUT =4’b0011; 
else if (IN == 4’b1001) OUT = 4’b0010; 
else if (IN == 4’b1010) OUT = 4’b0100; 
else if (IN == 4’b1011) OUT =4’b0110; 
else if (IN == 4’b1101) OUT =4’b0011; 
else if (IN == 4’b1110) OUT = 4’b0110; 
else if (IN == 4’b1111) OUT =4’b1001; 
else OUT = 4’b0000; 


endmodule 


Figure P3.2 — The code for Problem 3.17. 
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Derive a scheme for performing subtraction using BCD operands. Show a block diagram 
for the subtractor circuit. Hint: Subtraction can be performed easily if the operands are 
in the 10’s complement (radix complement) representation. In this representation the sign 
digit is O for a positive number and 9 for a negative number. 


Write complete Verilog code for the circuit that you derived in Problem 3.19. 


Suppose that we want to determine how many of the bits in a three-bit unsigned number 
are equal to 1. Design the simplest circuit that can accomplish this task. 


Repeat Problem 3.21 for a six-bit unsigned number. 
Repeat Problem 3.21 for an eight-bit unsigned number. 


Show a graphical interpretation of three-digit decimal numbers, similar to Figure 3.11. The 
left-most digit is 0 for positive numbers and 9 for negative numbers. Verify the validity of 
your answer by trying a few examples of addition and subtraction. 


In a ternary number system there are three digits: 0, 1, and 2. Figure P3.3 defines a ternary 
half-adder. Design a circuit that implements this half-adder using binary-encoded signals, 
such that two bits are used for each ternary digit. Let A = a,dp, B = bi bo, and Sum = 5150; 
note that Carry is just a binary signal. Use the following encoding: 00 = (0)3, 01 = (1)3, 
and 10 = (2)3. Minimize the cost of the circuit. 


AB | Carry | Sum 
00 0 0 
01 0 1 
02 0 2 
10 0 1 
11 0 2 
12 1 0 
20 0 2 
21 1 0 
22 1 1 


Figure P3.3. Ternary half-adder. 


Design a ternary full-adder circuit, using the approach described in Problem 3.25. 
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3.27 Consider the subtractions 26 — 27 = 99 and 18 — 34 = 84. Using the concepts presented 
in Section 3.3.4, explain how these answers (99 and 84) can be interpreted as the correct 
signed results of these subtractions. 
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COMBINATIONAL-CIRCUIT BUILDING 
BLOCKS 


CHAPTER OBJECTIVES 


In this chapter you will learn about: 


e Commonly used combinational subcircuits 


e Multiplexers, which can be used for selection of signals and for implementation 
of general logic functions 


e Circuits used for encoding, decoding, and code-conversion purposes 
e Key Verilog constructs used to define combinational circuits 
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Previous chapters have introduced the basic techniques for design of logic circuits. In practice, a few types 
of logic circuits are often used as building blocks in larger designs. This chapter discusses a number of these 
blocks and gives examples of their use. The chapter also includes a major section on Verilog, which describes 
several key features of the language. 


| 4.1 MULTIPLEXERS 


Multiplexers were introduced briefly in Chapter 2. A multiplexer circuit has a number 
of data inputs, one or more select inputs, and one output. It passes the signal value on 
one of the data inputs to the output. The data input is selected by the values of the select 
inputs. Figure 4.1 shows a 2-to-1 multiplexer. Part (a) gives the symbol commonly used. 
The select input, s, chooses as the output of the multiplexer either input wo or w;. The 
multiplexer’s functionality can be described in the form of a truth table as shown in part (b) 
of the figure. Part (c) gives a sum-of-products implementation of the 2-to-1 multiplexer. 
Part (d) illustrates how it can be constructed with transmission gates which are discussed 
in Appendix B. 

Figure 4.2a depicts a larger multiplexer with four data inputs, wo, ...,w3, and two 
select inputs, s; and so. As shown in the truth table in part (b) of the figure, the two-bit 
number represented by 5159 selects one of the data inputs as the output of the multiplexer. 


s ‘s 
Wo f 0 Wo 
we 1 Ww) 
(a) Graphical symbol (b) Truth table 
Wo Wo 
AY ft AY 
7 ual f 
(c) Sum-of-products circuit (d) Circuit with transmission gates 


Figure 4.1. A 2-to-1 multiplexer. 
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So 

Sy 
Wo 0 O Wo 

Wy 0] Ww 

Wo , ; 
1 0 W 

W3 
os 1 1 W3 
(a) Graphical symbol (b) Truth table 


(c) Circuit 


Figure 4.2 — A 4-to-1 multiplexer. 


A sum-of-products implementation of the 4-to-1 multiplexer appears in Figure 4.2c. It 
realizes the multiplexer function 


f = S1Sowo + 81 SoW1 + S1SqW2 + 51 50W3 


It is possible to build larger multiplexers using the same approach. Usually, the num- 
ber of data inputs, n, is an integer power of two. A multiplexer that has n data inputs, 
Wo, +--+; Wn-1, requires [ logon | select inputs. Larger multiplexers can also be constructed 
from smaller multiplexers. For example, the 4-to-1 multiplexer can be built using three 
2-to-1 multiplexers as illustrated in Figure 4.3. Figure 4.4 shows how a 16-to-1 multiplexer 
is constructed with five 4-to-1 multiplexers. 
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Sy 


So 


Ww 1 
W. 2 
Ww 3 


Figure 4.3. Using 2-to-1 multiplexers to build a 4-to-1 
multiplexer. 


Figure 4.4 =A 16-to-1 multiplexer. 


4.1 MULTIPLEXERS 


Xy ————_» Yo 


(b) Implementation using multiplexers 


Figure 4.5 = A practical application of multiplexers. 
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Figure 4.5 shows a circuit that has two inputs, x; and x2, and two outputs, y; and yo. As 
indicated by the blue lines, the function of the circuit is to allow either of its inputs to be 
connected to either of its outputs, under the control of another input, s. A circuit that has 
n inputs and k outputs, whose sole function is to provide a capability to connect any input 
to any output, is usually referred to as an nxk crossbar switch. Crossbars of various sizes 
can be created, with different numbers of inputs and outputs. When there are two inputs 
and two outputs, it is called a 22 crossbar. 

Figure 4.55 shows how the 2 x2 crossbar can be implemented using 2-to-1 multiplexers. 
The multiplexer select inputs are controlled by the signal s. If s = 0, the crossbar connects 
x, to y; and x2 to y2, while if s = 1, the crossbar connects x; to yz and x2 to y;. Crossbar 
switches are useful in many practical applications in which it is necessary to be able to 
connect one set of wires to another set of wires, where the connection pattern changes from 
time to time. 


Example 4.1 


4.1.1 SYNTHESIS OF LOGIC FUNCTIONS USING MULTIPLEXERS 


Multiplexers are useful in many practical applications, such as the one described above. 
They can also be used in a more general way to synthesize logic functions. Consider the 
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Wo 
WwW, W f ie, 
0 
0 0 0 
0 1 1 1 
1 O 1 1 A 
1 1 0 0 


WwW Wo f w1 f 
oof} Pm 
1 0 1 1 Wo 
1 1 0 
(b) Modified truth table 
Wy 
a7) 
f 
(c) Circuit 


Figure 4.6 — Synthesis of a logic function using mutiplexers. 


example in Figure 4.6a. The truth table defines the function f = w; ® w2. This function 
can be implemented by a 4-to-1 multiplexer in which the values of f in each row of the 
truth table are connected as constants to the multiplexer data inputs. The multiplexer select 
inputs are driven by w; and w2. Thus for each valuation of w;w2, the output f is equal to 
the function value in the corresponding row of the truth table. 

The above implementation is straightforward, but it is not very efficient. A better 
implementation can be derived by manipulating the truth table as indicated in Figure 4.6), 
which allows f to be implemented by a single 2-to-1 multiplexer. One of the input signals, 
w in this example, is chosen as the select input of the 2-to-1 multiplexer. The truth table 
is redrawn to indicate the value of f for each value of w;. When w; = 0, f has the same 
value as input w2, and when w; = 1, f has the value of w2. The circuit that implements 
this truth table is given in Figure 4.6c. This procedure can be applied to synthesize a circuit 
that implements any logic function. 
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Wi) Wow3] f 
0 0 0 0 
0 0 1 0 
0 1 0 0 
O01 1 1 
1 0 0 0) 
1 01 1 
1 1 0 1 
1 11 1 
(a) Modified truth table 
Ws 
Wy 
0 
w 
: f 
1 
(b) Circuit 
Figure 4.7 Implementation of the three-input majority function 


using a 4-to-1 multiplexer. 


Figure 4.7a gives the truth table for the three-input majority function, and it shows how the 
truth table can be modified to implement the function using a 4-to-1 multiplexer. Any two 
of the three inputs may be chosen as the multiplexer select inputs. We have chosen w; and 
w2 for this purpose, resulting in the circuit in Figure 4.7b. 
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Figure 4.8a indicates how the function f = w; © w2 © w3 can be implemented using 2-to-1 
multiplexers. When w; = 0, f is equal to the XOR of w2 and w3, and when w, = 1, f 
is the XNOR of w2 and w3. Part (b) of the figure gives a corresponding circuit. The left 
multiplexer in the circuit produces w2 © w3, using the result from Figure 4.6, and the right 
multiplexer uses the value of w, to select either w2 © w3 or its complement. Note that we 
could have derived this circuit directly by writing the function as f = (w2 ® w3) @ wy. 
Figure 4.9 gives an implementation of the three-input XOR function using a 4-to-1 
multiplexer. Choosing w; and w9 for the select inputs results in the circuit shown. 


Example 4.3 
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WW W3 f 
0 0 0 0 
0 0 1 1 
w, Ow Wo 
010] 1 — wy 
0 1 1 0 W3 
1 0 0 1 f 
1 0 1 0 
w, Ow 
110] 0 as 
1 11 1 
(a) Truth table (b) Circuit 


Figure 4.8 — Three-input XOR implemented with 2-to-1 multiplexers. 


Wi Wow] f 
0 0 0 0 
Is, 
0 0 1 1 2 
G1 |) Otis 
Je 
0 11 0 3 
1 0 0 1 } ws f 
1 01 0) . 
1 1 0 0 } 
W3 
1 11 1 : 
(a) Truth table (b) Circuit 


Figure 4.9 — Three-input XOR implemented with a 4-to-1 multiplexer. 


4.1.2 MULTIPLEXER SYNTHESIS USING SHANNON’S EXPANSION 


Figures 4.6 through 4.9 illustrate how truth tables can be interpreted to implement logic 
functions using multiplexers. In each case the inputs to the multiplexers are the constants 
0 and 1, or some variable or its complement. Besides using such simple inputs, it is 
possible to connect more complex circuits as inputs to a multiplexer, allowing functions to 
be synthesized using a combination of multiplexers and other logic gates. Suppose that we 
want to implement the three-input majority function in Figure 4.7 using a 2-to-1 multiplexer 
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w,Ww.w3z] f 
0 0 0 0 
0 0 1 0 

0 WoW3 
a > 1 Wr+w 
Oa £:|) a oe 
1 0 0 0 
101 1 
1 1 0 1 
1 1 1 1 


(a) Truth table 


Wy) 


W3 


(b) Circuit 


Figure 4.10 The three-input majority function implemented using a 
2-to-1 multiplexer. 


in this way. Figure 4.10 shows an intuitive way of realizing this function. The truth table 
can be modified as shown on the right. If w; = 0, then f = w2w3, and if w; = 1, then 
f =w2+w3. Using w; as the select input for a 2-to-1 multiplexer leads to the circuit in 
Figure 4.10b. 

This implementation can be derived using algebraic manipulation as follows. The 
function in Figure 4.10a is expressed in sum-of-products form as 


f = Wiw2w3 + w1W2W3 + W1W2W3 + Wi W2W3 
It can be manipulated into 
f = W(w2w3) + wi (W2w3 + W2W3 + W2W3) 
= W1(W2w3) + wi (W2 + W3) 


which corresponds to the circuit in Figure 4.10b. 

Multiplexer implementations of logic functions require that a given function be decom- 
posed in terms of the variables that are used as the select inputs. This can be accomplished 
by means of a theorem proposed by Claude Shannon [1]. 
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Shannon’s Expansion Theorem Any Boolean function f (w),..., w,) can be written in 
the form 


SF (W1, Wa, ---,Wn) =W1-f (0, wo, .--, Wa) tw “fC, Wo, .--, Wn) 


This expansion can be done in terms of any of the n variables. We will leave the proof of 
the theorem as an exercise for the reader (see Problem 4.9). 

To illustrate its use, we can apply the theorem to the three-input majority function, 
which can be written as 


fW1, W2, W3) = WiW2 + Wiw3 + W2W3 
Expanding this function in terms of w; gives 
f =Wwi(0-w2 + 0-w3 + w2w3) + wi(1 - w2 + 1 w3 + Wow3) 
= Wi (w2w3) + wi (W2 + W3) 


which is the expression that we derived above. 
For the three-input XOR function, we have 


f =Ww, Ow. @ w3 
=w (0 8 w2 € w3) + wi (1 @ w2 @ w3) 


W 1 - (W2 B w3) + Wy - (W2 © 3) 


which gives the circuit in Figure 4.8. 


In Shannon’s expansion the term/ (0, w2, ..., w,,) is called the cofactor off with respect 
to Ww; it is denoted in shorthand notation as f;,. Similarly, the term f(1, w2,..., Wn) 18 
called the cofactor of f with respect to w1, written f,,. Hence we can write 
f= wif, + Wifw, 
In general, if the expansion is done with respect to variable w;, then fg, denotes 
Sf1,.--, Wi-1, 0, Wi41, ---, Wn), fw; denotes f(w1,..., Wi-1, 1, Wi41,---, Wn), and 


fW1,.--, Wn) = Wife, + wih; 


The complexity of the logic expression may vary depending on which variable, w;, is used, 
as illustrated in Example 4.4. 


For the function f = w,;w3 + w2W3, decomposition using w; gives 
f = Wife, + Wifw: 
= Wi(w3 + W2) + wi (w2W3) 
Using w2 instead of w, produces 
f = Wofirs + Wofwvs 
= W2(W1W3) + wo(Wiw3 + W3) 


= W2(W1W3) + W2(W) + W3) 
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Finally, using w3 gives 


f = Wah oF Wf; 
= W3(w2) + w3(W1) 
The results generated using w; and w2 have the same cost, but the expression produced 
using w3 has a lower cost. In practice, when performing decompositions of this type it is 
useful to try a number of alternatives and choose the one that produces the best result. 


Shannon’s expansion can be done in terms of more than one variable. For example, 
expanding a function in terms of w; and w2 gives 


f1,---,Wn) = Wiw2- f (0, 0, w3,.-., Wa) + Wiwe -f (0, 1, w3,..-, Wn) 
+ wiwo-f (1, 0, w3,.--,Wp) twiwe - fC, 1, W3,---, Wn) 


This expansion gives a form that can be implemented using a 4-to-1 multiplexer. If Shan- 
non’s expansion is done in terms of all n variables, then the result is the canonical sum-of- 
products form, which was defined in Section 2.6.1. 


199 


Assume that we wish to implement the function 
f =Wwiw3 + wiw2 + wiw3 


using a 2-to-1 multiplexer and any other necessary gates. Shannon’s expansion using w; 
gives 


f = Wifw + Wifw, 
= W1(W3) + wi(W2 + W3) 


The corresponding circuit is shown in Figure 4.11a. Assume now that we wish to use a 
4-to-1 multiplexer instead. Further decomposition using w2 gives 


f = WiWofiim + WiW2fiiws + WiWafwim + WiW2fwiws 


= W1W2(W3) + W1W2(W3) + W1W2(W3) + WiW2(1) 


The circuit is shown in Figure 4.11. 


Example 4.5 


Consider the three-input majority function 
f =wiw2 + wiw3 + w2w3 


We wish to implement this function using only 2-to-1 multiplexers. Shannon’s expansion 
using w yields 


SF = Wi (w2w3) + wi (w2 + w3 + w2w3) 


= W1(W2W3) + Wi (W2 + W3) 


Example 4.6 
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al 


W3 
M5 


(b) Using a 4-to-1 multiplexer 


Figure 4.11 The circuits synthesized in Example 4.5. 


Wo ual 


W3 


Figure 4.12 — The circuit synthesized in Example 4.6. 


Let g = w2w3 and h = w2 + w3. Expansion of both g and h using w2 gives 
& = W2(0) + w2(w3) 
h = w2(w3) + w2(1) 


The corresponding circuit is shown in Figure 4.12. It is equivalent to the 4-to-1 multiplexer 
circuit derived using a truth table in Figure 4.7. 
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4.2 DECODERS 


Consider the logic circuit in Figure 4.13. It has two inputs, w; and wo, and four outputs, 
yo, Y1, Y2, and y3. As shown in the truth table, only one of the outputs is asserted at a time, 
and each output corresponds to one valuation of the inputs. Setting the inputs w1wo to 00, 
01, 10, or 11 causes the output yo, yi, y2, or y3 to be set to 1, respectively. This type of 
circuit is called a binary decoder. Its inputs represent a binary number, which is decoded 
to assert the corresponding output. A circuit symbol and logic circuit for this decoder are 
shown in parts (b) and (c) of the figure. Each output is driven by an AND gate that decodes 
the corresponding valuation of w;wo. 

It is useful to include an enable input, En, in a decoder circuit, as illustrated in Fig- 
ure 4.14. When enabled by setting En = 1 the decoder behaves as presented in Figure 4.13. 


W, Wo Yo ¥1 Yo ¥3 


Wo y1 
Wy y2 


le =) 
= Of © 
oo oF 
oor°o 
oro eo 
rer ooo 


(a) Truth table (b) Graphical symbol 


Wo 


Yo 


Wy 


yt 


Y2 


¥3 


(c) Logic circuit 


Figure 4.13 = A 2-to-4 decoder. 
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(a) Truth table 
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0 0 
0 O 
1 0 
0 1 
0 0 


(c) Logic circuit 


Ww 0 


Wi 


En 


n : 
inputs . 


Enable 
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(b) Graphical symbol 


Yo 
yy 
y2 
¥3 
. on 
: outputs 


(d) An n-to-2” decoder 


Figure 4.14 Binary decoder. 
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But, if it is disabled by setting En = 0, then none of the outputs are asserted. Note that 
only five rows are shown in the truth table, because if En = 0 then all outputs are equal 
to 0 regardless of the values of w; and wo. The truth table indicates this by showing x 
when it does not matter whether the variable in question has the value 0 or 1. A graphical 
symbol for this decoder is given in Figure 4.14b. Part (c) of the figure shows how the enable 
capability can be included in the decoder of Figure 4.13c. A binary decoder with n inputs 
has 2” outputs. A graphical symbol for an n-to-2” decoder is shown in Figure 4.14d. 

A k-bit binary code in which exactly one of the bits is set to 1 at a time is referred to 
as one-hot encoded, meaning that the single bit that is set to 1 is deemed to be “hot.” The 
outputs of an enabled binary decoder are one-hot encoded. 

We should also note that decoders can be designed to have either active-high or active- 
low outputs. In our discussion, we have assumed that active-high outputs are needed. 

Larger decoders can be built using the sum-of-products structure in Figure 4.14c, or 
else they can be constructed from smaller decoders. Figure 4.15 shows how a 3-to-8 decoder 
is built with two 2-to-4 decoders. The w2 input drives the enable inputs of the two decoders. 
The top decoder is enabled if w2 = 0, and the bottom decoder is enabled if w2 = 1. This 
concept can be applied for decoders of any size. Figure 4.16 shows how five 2-to-4 decoders 
can be used to construct a 4-to-16 decoder. Because of its treelike structure, this type of 
circuit is often referred to as a decoder tree. 
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Decoders are useful for many practical purposes. In Figure 4.2c we showed the sum-of- 
products implementation of the 4-to-1 multiplexer, which requires AND gates to distinguish 
the four different valuations of the select inputs s; and so. Since a decoder evaluates the 
values on its inputs, it can be used to build a multiplexer as illustrated in Figure 4.17. The 
enable input of the decoder is not needed in this case, and it is set to 1. The four outputs of 
the decoder represent the four valuations of the select inputs. 


Example 4.7 


4.2.1 DEMULTIPLEXERS 


We showed in Section 4.1 that a multiplexer has one output, n data inputs, and [ logon | 
select inputs. The purpose of the multiplexer circuit is to multiplex the n data inputs onto 
the single data output under control of the select inputs. A circuit that performs the opposite 
function, namely, placing the value of a single data input onto multiple data outputs, is 
called a demultiplexer. The demultiplexer can be implemented using a decoder circuit. For 
example, the 2-to-4 decoder in Figure 4.14 can be used as a 1-to-4 demultiplexer. In this 
case the En input serves as the data input for the demultiplexer, and the yo to y3 outputs 
are the data outputs. The valuation of w;wo determines which of the outputs is set to the 
value of En. To see how the circuit works, consider the truth table in Figure 4.14a. When 
En = 0, all the outputs are set to 0, including the one selected by the valuation of w; wo. 
When En = 1, the valuation of w1wo sets the appropriate output to 1. 

In general, ann-to-2” decoder circuit can be used as a |-to-m demultiplexer. However, in 
practice decoder circuits are used much more often as decoders rather than as demultiplexers. 
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5 


Figure 4.15 — A 3-to-8 decoder using two 2-to-4 decoders. 
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Yi2 
Y13 
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Figure 4.16 =A 4-to-16 decoder built using a decoder tree. 


4.3. ENCODERS 


Figure 4.17 ~— A 4-to-1 multiplexer built using a decoder. 
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4.3 ENCODERS 


An encoder performs the opposite function of a decoder. It encodes given information into 
a more compact form. 


4.3.1 BINARY ENCODERS 


A binary encoder encodes information from 2” inputs into an n-bit code, as indicated in 
Figure 4.18. Exactly one of the input signals should have a value of 1, and the outputs 
present the binary number that identifies which input is equal to 1. The truth table for a 
4-to-2 encoder is provided in Figure 4.19a. Observe that the output yo is 1 when either 
input w; or w3 is 1, and output y; is 1 when input w2 or w3 is 1. Hence these outputs can be 
generated by the circuit in Figure 4.19b. Note that we assume that the inputs are one-hot 
encoded. All input patterns that have multiple inputs set to 1 are not shown in the truth 
table, and they are treated as don’t-care conditions. 

Encoders are used to reduce the number of bits needed to represent given information. 
A practical use of encoders is for transmitting information in a digital system. Encoding 
the information allows the transmission link to be built using fewer wires. Encoding is also 
useful if information is to be stored for later use because fewer bits need to be stored. 


4.3.2 PRIORITY ENCODERS 


Another useful class of encoders is based on the priority of input signals. In a priority 
encoder each input has a priority level associated with it. The encoder outputs indicate the 
active input that has the highest priority. When an input with a high priority is asserted, the 
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Figure 4.18 — A 2”-to-n binary encoder. 
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Figure 4.19 = A 4-to-2 binary encoder. 


other inputs with lower priority are ignored. The truth table for a 4-to-2 priority encoder is 
shown in Figure 4.20. It assumes that wo has the lowest priority and w3 the highest. The 
outputs y; and yo represent the binary number that identifies the highest priority input set 
to 1. Since it is possible that none of the inputs is equal to 1, an output, z, is provided to 
indicate this condition. It is set to | when at least one of the inputs is equal to 1. It is set to 
0 when all inputs are equal to 0. The outputs y; and yo are not meaningful in this case, and 
hence the first row of the truth table can be treated as a don’t-care condition for y; and yo. 
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Figure 4.20 Truth table for a 4-to-2 priority encoder. 


The behavior of the priority encoder is most easily understood by first considering 
the last row in the truth table. It specifies that if input w3 is 1, then the outputs are set to 
yiyo = 11. Because w3 has the highest priority level, the values of inputs w2, w1, and wo 
do not matter. To reflect the fact that their values are irrelevant, w2, w;, and wo are denoted 
by the symbol x in the truth table. The second-last row in the truth table stipulates that 
if w2 = 1, then the outputs are set to yjyo = 10, but only if w3 = 0. Similarly, input w; 
causes the outputs to be set to yj yo = 01 only if both w3 and w2 are 0. Input wo produces 
the outputs y; yo = 00 only if wo is the only input that is asserted. 

A logic circuit that implements the truth table can be synthesized by using the techniques 
developed in Chapter 2. However, a more convenient way to derive the circuit is to define 
a set of intermediate signals, ip, ..., 3, based on the observations above. Each signal, ix, 
is equal to 1 only if the input with the same index, w;, represents the highest-priority input 
that is set to 1. The logic expressions for ig, ..., i3 are 


ig = W3W2W1Wo 
ly = W3W2W] 
In = W3W2 


3 = W3 


Using the intermediate signals, the rest of the circuit for the priority encoder has the same 
structure as the binary encoder in Figure 4.19, namely 


YH tis 
YHht+h 


The output z is given by 


Z=lbththt+h 
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4.4 CopDE CONVERTERS 


The purpose of the decoder and encoder circuits is to convert from one type of input 
encoding to a different output encoding. For example, a 3-to-8 binary decoder converts 
from a binary number on the input to a one-hot encoding at the output. An 8-to-3 binary 
encoder performs the opposite conversion. There are many other possible types of code 
converters. One common example is a BCD-to-7-segment decoder, which was introduced 
in Section 2.14. A similar decoder is often used to display hexadecimal information on 
seven-segment displays. As explained in Section 3.1.2, long binary numbers are easier to 
deal with visually if they are represented in the hexadecimal form. A hex-to-7-segment 
decoder can be implemented as shown in Figure 4.21. Digits 0 to 9 are displayed the same 
as in the case of the BCD-to-7-segment decoder. Digits 10 to 15 are displayed as A, b, C, 
d, E, and F. 

We should note that although the word decoder is traditionally used for such circuits, a 
more appropriate term is code converter. The term decoder is more appropriate for circuits 
that produce one-hot encoded outputs. 


4.5 ARITHMETIC COMPARISON CIRCUITS 


Chapter 3 presented arithmetic circuits that perform addition, subtraction, and multiplication 
of binary numbers. Another useful type of arithmetic circuit compares the relative sizes 
of two binary numbers. Such a circuit is called a comparator. This section considers the 
design of a comparator that has two n-bit inputs, A and B, which represent unsigned binary 
numbers. The comparator produces three outputs, called AegB, AgtB, and A/tB. The AeqB 
output is set to | if A and B are equal. The Ag¢B output is | if A is greater than B, and the 
AItB output is | if A is less than B. 

The desired comparator can be designed by creating a truth table that specifies the three 
outputs as functions of A and B. However, even for moderate values of n, the truth table is 
large. A better approach is to derive the comparator circuit by considering the bits of A and 
B in pairs. We can illustrate this by a small example, where n = 4. 

Let A = a3a2a,;a) and B= b3bzb,bo. Define a set of intermediate signals called 
i3, i, i, and ig. Each signal, i,, is 1 if the bits of A and B with the same index are equal. 
That is, i, = ay © by. The comparator’s AegB output is then given by 


AegB = I31g11 ig 


An expression for the AgtB output can be derived by considering the bits of A and B in the 
order from the most-significant bit to the least-significant bit. The first bit-position, k, at 
which a, and b, differ determines whether A is less than or greater than B. If a, = 0 and 
b, = 1, then A < B. But if a, = 1 and b, = 0, then A > B. The Ag#B output is defined by 


AgtB = a3b3 + i3a2b> + izinayb, + izini;agD0 
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(a) Code converter (b) 7-segment display 
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(c) Truth table 


Figure 4.21 = A hex-to-7-segment display code converter. 


The i, signals ensure that only the first bits, considered from the left to the right, of A and 
B that differ determine the value of AgtB. 
The AJ/tB output can be derived by using the other two outputs as 


AItB = AeqB + AgtB 


A logic circuit that implements the four-bit comparator circuit is shown in Figure 4.22. This 
approach can be used to design a comparator for any value of n. 

Comparator circuits, like most logic circuits, can be designed in different ways. Another 
approach for designing a comparator circuit is presented in Example 3.9 in Chapter 3. 
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AeqB 


AgtB 


Figure 4.22 =A four-bit comparator circuit. 


| 4.6 VERILOG FOR COMBINATIONAL CIRCUITS 


Having presented a number of useful building block circuits, we will now consider how 
such circuits can be described in Verilog. Rather than using gates or logic expressions, 
we will specify the circuits in terms of their behavior. We will also give a more rigorous 
description of previously used behavioral Verilog constructs and introduce some new ones. 


4.6.1 THE CONDITIONAL OPERATOR 


In a logic circuit it is often necessary to choose between several possible signals or values 
based on the state of some condition. A typical example is a multiplexer circuit in which 
the output is equal to the data input signal chosen by the valuation of the select inputs. For 
simple implementation of such choices Verilog provides a conditional operator (?:) which 
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assigns one of two values depending on a conditional expression. It involves three operands 
used in the syntax 


conditional_expression ? true_expression : false_expression 


If the conditional expression evaluates to | (true), then the value of true_expression is 
chosen; otherwise, the value of false_expression is chosen. For example, the statement 


A=(B <C)?(D+5): (D+ 2); 


means that if B is less than C, the value of A will be D+ 5, or else A will have the 
value D + 2. We used parentheses in the expression to improve readability; they are not 
necessary. The conditional operator can be used both in continuous assignment statements 
and in procedural statements inside an always block. 


A 2-to-1 multiplexer can be defined using the conditional operator in an assign statement 
as shown in Figure 4.23. The module, named mux2to1, has the inputs wo, w;, and s, and 
the output f. The signal s is used for the selection criterion. The output f is equal to w, if 
the select input s has the value 1; otherwise, f is equal to wo. Figure 4.24 shows how the 
same multiplexer can be defined by using the conditional operator inside an always block. 
The same approach can be used to define the 4-to-1 multiplexer from Figure 4.2. As 
seen in the truth table in Figure 4.2), if the select input s; = 1, then fis set to either w2 or 
w3 based on the value of so. Similarly, if s; = 0, then fis set to either wo or w;. Figure 4.25 
shows how nested conditional operators can be used to define this function. The module 
is called mux4tol. Its select inputs are represented by the two-bit vector S. The first 
conditional expression tests the value of bit s,. If s; = 1, then so is tested and fis set to w3 
if so = | and fis set to w2 if so = 0. This corresponds to the third and fourth rows of the 
truth table in Figure 4.2b. Similarly, if s; = 0 the conditional operator on the right chooses 
f =w1 if so = 1 and f = wo if so = 0, thus realizing the first two rows of the truth table. 
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Example 4.8 


module mux2tol (w0, wl, s, f); 
input w0, wl, s; 
output f; 
assign f=s? wl: w0; 


endmodule 


Figure 4.23 = A 2-to-1 multiplexer specified using the 
conditional operator. 
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module mux2tol (w0, wl, s, f); 
input w0, wl, s; 
output reg f; 


always @(w0, wl, s) 
f=s? wl: wo; 


endmodule 


Figure 4.24 = An alternative specification of a 2-to-1 
multiplexer using the conditional operator. 


module mux4tol (w0, wl, w2, w3, S, f); 
input w0, wl, w2, w3; 
input [1:0] S; 
output f; 
assign f = S[1] ? (S[0] ? w3 : w2): (S[0] ? wl : w0); 


endmodule 


Figure 4.25 = A 4-to-1 multiplexer specified using the conditional operator. 


4.6.2 Tue Ir-ELSE STATEMENT 


We have already used the if-else statement in previous chapters. It has the syntax 


if (conditional_expression) statement; 
else statement; 


The conditional expression may use the operators given in Table A.1. If the expression 
is evaluated to true then the first statement (or a block of statements delineated by begin 
and end keywords) is executed, or else the second statement (or a block of statements) is 
executed. 


Example 4.9 Figure 4.26 shows how the if-else statement can be used to describe a 2-to-1 multiplexer. 
The if clause states that f is assigned the value of wo when s = 0. Else, f is assigned the 
value of w,. 
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module mux2tol (w0, wl, s, f); 
input w0, wl, s; 
output reg f; 


always @(w0, wl, s) 


if (s == 0) 
f = w0O; 
else 
f=wl; 
endmodule 


Figure 4.26 = Code for a 2-to-1 multiplexer using the 
if-else statement. 


The if-else statement can be used to implement larger multiplexers. A4-to-1 multiplexer 
is shown in Figure 4.27. The if-else clauses set f to the value of one of the inputs wo, ..., w3, 
depending on the valuation of S. 

Another way of defining the same circuit is presented in Figure 4.28. In this case, a 
four-bit vector W is defined instead of single-bit signals wo, w1, w2, and w3. Also, the four 
different values of S are specified as decimal rather than binary numbers. 


Figure 4.4 shows how a 16-to-1 multiplexer can be built by using five 4-to-1 multiplexers. Example 4.10 
Figure 4.29 presents Verilog code for this circuit using five instantiations of the mux4tol 
module. The data inputs to the mux/6to] module are the 16-bit vector W, and the select 
inputs are the four-bit vector S. In the Verilog code signal names are needed for the outputs 
of the four 4-to-1 multiplexers on the left of Figure 4.4. A four-bit signal named M is used 
for this purpose. The first multiplexer instantiated, Mux/, corresponds to the multiplexer 
at the top left of Figure 4.4. Its first four ports are driven by the signals W[O], ..., W[3]. 
The syntax S[1:0] is used to attach the signals S[1] and S[O] to the two-bit S port of the 
mux4tol module. The M [0] signal is connected to the multiplexer’s output port. Similarly, 
Mux2, Mux3, and Mux4 are instantiations of the next three multiplexers on the left. The 
multiplexer on the right of Figure 4.4 is instantiated as Mux5. The signals M [0], ..., M[3] 
are connected to its data inputs, and bits $[3] and S[2] are attached to the select inputs. The 
output port generates the mux/6tol output f. Compiling the code results in the multiplexer 
function 


f = 835251SqWo + $38251SoW1 + 535251SqW2 +--+ + $38251SoW14 + $35251S0W15 


Since the mux4tol module is being instantiated in the code of Figure 4.29, it is nec- 
essary to either include the code of Figure 4.28 in the same file as the mux/]6tol module 
or place the mux4to] module in a separate file in the same directory, or a directory with a 
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module mux4tol (w0, wl, w2, w3, S, f); 
input w0, wl, w2, w3; 
input [1:0] S; 
output reg f; 


always @(*) 

if (S == 2’b00) 
f=w0; 

else if (S == 2’b01) 
f=wl; 

else if (S == 2’b10) 
f=w2; 

else 
f=w3; 


endmodule 


Figure 4.27 = Code for a 4-to-1 multiplexer using the if-else 
statement. 


module mux4tol (W, S, f); 
input [0:3] W; 
input [1:0] S; 
output reg f; 


always @(W, S) 


if (S == 0) 
f= W/[0]; 
else if (S == 1) 
f= W[1]; 
else if (S == 2) 
f= W[2]; 
else 
f = W[3]; 
endmodule 


Figure 4.28 = Alternative specification of a 4-to-1 
multiplexer. 


specified path so that the Verilog compiler can find it. Observe that if the code in Figure 4.27 
were used as the required mux4tol module, then we would have to list the ports separately, 
as in W[0], W[1], W[2], W[3], rather than as the vector W[0:3]. 
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module mux16tol (W, S, f); 
input [0:15] W; 
input [3:0] S; 
output f; 
wire [0:3] M; 


mux4tol Mux! (W[0:3], S[1:0], M[0]); 
mux4tol Mux2 (W[4:7], S[1:0], M[1]); 
mux4tol Mux3 (W[8:11], S[1:0], M[2]); 
mux4tol Mux4 (W[12:15], S[1:0], M[3]); 
mux4tol Mux5 (M[0:3], S[3:2], f); 


endmodule 


Figure 4.29 — Hierarchical code for a 16-to-1 multiplexer. 


4.6.3 THE CASE STATEMENT 


The if-else statement provides the means for choosing an alternative based on the value of 
an expression. When there are many possible alternatives, the code based on this statement 
may become awkward to read. Instead, it is often possible to use the Verilog case statement 
which is defined as 


case (expression) 
alternativel: statement; 
alternative2: statement; 


alternativej: statement; 
[default: statement;] 
endcase 


The value of the controlling expression and each alternative are compared bit by bit. When 
there is one or more matching alternative, the statement(s) associated with the first match 
(only) is executed. When the specified alternatives do not cover all possible valuations of 
the controlling expression, the optional default clause should be included. Otherwise, the 
Verilog compiler will synthesize memory elements to deal with the unspecified possibilities; 
we will discuss this issue in Chapter 5. 


The case statement can be used to define a 4-to-1 multiplexer as shown in Figure 4.30. The Example 4.11 
four values that the select vector S can have are given as decimal numbers, but they could 
also be given as binary numbers. 
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module mux4tol (W, S, f); 
input [0:3] W; 
input [1:0] S; 
output reg f; 


always @(W, S) 


case (S) 
0: f= W[0]; 
1: f=W[1]; 
2: f= W/[2]; 
3: f= W[3]; 
endcase 
endmodule 


Figure 4.30 =A 4-to-1 multiplexer defined using the 
case statement. 


Example 4.12 


Figure 4.31 shows how a case statement can be used to describe the truth table for a 2-to-4 
binary decoder. The module is called dec2to4. The data inputs are the two-bit vector W, 
and the enable input is En. The four outputs are represented by the four-bit vector Y. 

In the truth table for the decoder in Figure 4.14a, the inputs are listed in the order 
En w, Wo. To represent these three signals in the controlling expression, the Verilog code 
uses the concatenate operator to combine the En and W signals into a three-bit vector. The 
four alternatives in the case statement correspond to the truth table in Figure 4.14a where 
En = 1, and the decoder outputs have the same patterns as in the first four rows of the 
truth table. The last clause uses the default keyword and sets the decoder outputs to 0000, 
because it represents all other cases, namely those where En = 0. 


Example 4.13 


The 2-to-4 decoder can be specified using a combination of if-else and case statements as 
given in Figure 4.32. If En = 0, then all four bits of the output Y are set to the value 0, else 
the case alternatives are evaluated if En = 1. 


Example 4.14 


The tree structure of the 4-to-16 decoder in Figure 4.16 can be defined as shown in Figure 
4.33. The inputs are a four-bit vector W and an enable signal En. The outputs are represented 
by the 16-bit vector Y. The circuit uses five instances of the 2-to-4 decoder defined in either 
Figure 4.31 or 4.32. The outputs of the left-most decoder in Figure 4.16 are denoted as the 
four-bit vector M in Figure 4.33. 
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module dec2to4 (W, En, Y); 
input [1:0] W; 
input En; 
output reg [0:3] Y; 


always @(W, En) 
case ({En, W}) 
3’b100: Y = 4’b1000; 
3’b101: Y = 4’b0100; 
3’b110: Y = 4’b0010; 
3’b111: Y =4’b0001; 
default: Y = 4’b0000; 
endcase 


endmodule 


Figure 4.31 Verilog code for a 2-to-4 binary decoder. 


module dec2to4 (W, En, Y); 
input [1:0] W; 
input En; 
output reg [0:3] Y; 


always @(W, En) 
begin 
if (En == 0) 
Y = 4’b0000; 
else 
case (W) 
0: Y = 4’b1000; 
1: Y = 4’b0100; 
2: Y = 4’b0010; 
3: Y = 4’b0001; 
endcase 
end 


endmodule 


Figure 4.32 — Alternative code for a 2-to-4 binary 
decoder. 
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module dec4to16 (W, En, Y); 
input [3:0] W; 
input En; 
output [0:15] Y; 
wire [0:3] M; 


dec2to4 Decl (W[3:2], M[0:3], En); 
dec2to4 Dec2 (W[1:0], Y[0:3], M[0]); 
dec2to4 Dec3 (W[1:0], Y[4:7], M[1]); 
dec2to4 Dec4 (W[1:0], Y[8:11], M[2]); 
dec2to4 Dec5 (W[1:0], Y[12:15], M[3]); 


endmodule 


Figure 4.33 Verilog code for a 4-to-16 decoder. 


Example 4.15 


Another example of a case statement is given in Figure 4.34. The module, seg7, represents 
the hex-to-7-segment decoder in Figure 4.21. The hexadecimal input is the four-bit vector 
named hex, and the seven outputs are the seven-bit vector named leds. The case alternatives 
are listed so that they resemble the truth table in Figure 4.21c. Note that there is a comment 
to the right of the case statement, which labels the seven outputs with the letters from a 
to g. These labels indicate to the reader the correlation between the bits of the /eds vector 
in the Verilog code and the seven segments in Figure 4.21b. 


Example 4.16 


Anarithmetic logic unit (ALU) is a logic circuit that performs various Boolean and arithmetic 
operations on n-bit operands. Table 4.1 specifies the functionality of a simple ALU, known 
as the 74381 chip, which has been available in the form of a standard chip in the family 
called the 7400-series. This ALU has 2 four-bit data inputs, A and B, a three-bit select 
input, S, and a four-bit output, F. As the table shows, F is defined by various arithmetic or 
Boolean operations on the inputs A and B. In this table + means arithmetic addition, and 
— means arithmetic subtraction. To avoid confusion, the table uses the words XOR, OR, 
and AND for the Boolean operations. Each Boolean operation is done in a bitwise fashion. 
For example, F = A AND B produces the four-bit result fp = agbo, fi = ahi, fo = arba, 
and f3 = a3b3. 

Figure 4.35 shows how the functionality of the 74381 ALU can be described in Verilog 
code. The case statement shown corresponds directly to Table 4.1. 


The Casex and Casez Statements 

Logic circuits that we have considered so far operate using the logic values 0 and 1. 
When specifying the functionality of such circuits in the form of a truth table, we sometimes 
encounter cases where it does not matter whether a given logic variable has the value 0 or 
1, as seen in Figures 4.14a and 4.20. It is customary to use the letter x to denote such cases, 
where x represents an unknown value. 
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module seg7 (hex, leds); 
input [3:0] hex; 
output reg [1:7] leds; 


always @ (hex) 
case (hex) //abcdefg 
0: leds = 7’b1111110; 
: leds = 7’b0110000; 
: leds = 7’b1101101; 
: leds = 7’b1111001; 
: leds = 7’b0110011; 
: leds = 7°b1011011; 
: leds = 7°b1011111; 
: leds = 7’b1110000; 
: leds = 7’b1111111; 
: leds = 7°b1111011; 
10: leds = 7’b1110111; 
11: leds = 7’b0011111; 
12: leds = 7’b1001110; 
13: leds = 7’b0111101; 
14: leds = 7’b1001111; 
15: leds = 7’b1000111; 
endcase 


OMDANNDMNFWN KF 


endmodule 


Figure 4.34 — Code for a hex-to-7-segment decoder. 


Table 4.1 The functionality 


of the 74381 
ALU. 
Inputs Outputs 

Operation S251 So F 
Clear 000 0000 
B-A 001 B-A 
A-B 010 A-B 
ADD On Ae 
XOR 100 A XORB 
OR 101 AORB 


AND 110 AAND B 
Preset Poy 1 es i Ts) | 
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// 74381 ALU 

module alu (S, A, B, F); 
input [2:0] S; 
input [3:0] A, B; 
output reg [3:0] F; 


always @(S, A, B) 


case (S) 
0: F = 4’b0000; 
1: F=B- A; 
2: F=A— B; 
3: F=A+B; 
4: F=A% B; 
5: F=A|B; 
6: F=A XB; 
7. F=4bl111; 

endcase 

endmodule 


Figure 4.35 = Code that represents the functionality of 
the 74381 ALU chip. 


It is also possible to implement circuits that can produce three different types of output 
signals. In addition to the usual 0 and 1 values, there is a third value that indicates that the 
output line is not connected to any defined voltage level. In this state the output behaves like 
an open circuit, as explained in Appendix B. We say that the output is in the high-impedance 
state, which is usually denoted by using the letter z. 

In Verilog, a signal can have four possible values: 0, 1, z, or x. The z and x values can 
also be denoted by the capital letters Z and X. In the case statement it is possible to use 
the logic values 0, 1, z, and x in the case alternatives. A bit-by-bit comparison is used to 
determine the match between the expression and one of the alternatives. 

Verilog provides two variants of the case statement that treat the z and x values in 
a different way. The casez statement treats all z values in the case alternatives and the 
controlling expression as don’t cares. The casex statement treats all z and x values as don’t 
cares. 


Example 4.17 


Figure 4.36 gives Verilog code for the priority encoder defined in Figure 4.20. The desired 
priority scheme is realized by using a casex statement. The first alternative specifies that 
the output is set to yj yo = 3 if the input w3 is 1. This assignment does not depend on the 
values of inputs w2, w1, or wo; hence their values do not matter. The other alternatives in 
the casex statement are evaluated only if w3 = 0. The second alternative states that if w2 
is 1, then yjyo = 2. If w2 = 0, then the next alternative results in y;yo = 1 if w; = 1. If 
W3 = W2 = W; = 0 and wo = 1, then the fourth alternative results in yj yo = 0. 
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module priority (W, Y, z); 
input [3:0] W; 
output reg [1:0] Y; 
output reg z; 


always @(W) 


begin 
Z= 1k 
casex (W) 
4’blxxx: Y =3; 
4’bO1xx: Y = 2; 
4’b001x: Y= 1; 
4’b0001: Y = 0; 
default: begin 
z=0; 
Y =2’bx; 
end 
endcase 
end 
endmodule 


Figure 4.36 Verilog code for a priority encoder. 


The priority encoder’s output z must be set to 1 whenever at least one of the data inputs 
is 1. This output is set to 1 outside the casex statement in the always block. If none of the 
four alternatives matches the value of W, then the default clause overrides the value of z 
and sets it to 0. The default clause also indicates that the Y output can be set to any pattern 
because it will be ignored. 


221 


4.6.4 THE For Loop 


If the structure of a desired circuit exhibits a certain regularity, it may be convenient to 
define the circuit using a for loop. We introduced the for loop in Section 3.5.4, where it 
was useful in a generic specification of a ripple-carry adder. The for loop has the syntax 


for (initial_index; terminal_index; increment) statement; 


A loop control variable, which has to be of type integer, is set to the value given as the 
initial index. It is used in the statement or a block of statements delineated by begin and end 
keywords. After each iteration, the control variable is changed as defined in the increment. 
The iterations end after the control variable has reached the terminal index. 
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Unlike for loops in high-level programming languages, the Verilog for loop does not 
specify changes that take place in time through successive loop iterations. Instead, during 
each iteration it specifies a different subcircuit. In Figure 3.25 the for loop was used to 
define a cascade of full-adder subcircuits to form an n-bit ripple-carry adder. The for loop 
can be used to define many other structures as illustrated by the next two examples. 


Example 4.18 


Figure 4.37 shows how the for loop can be used to specify a 2-to-4 decoder circuit. The 
effect of the loop is to repeat the if-else statement four times, for k = 0,...,3. The first 
loop iteration sets yp = 1 if W = O and En = 1. Similarly, the other three iterations set the 
values of y;, y2, and y3 according to the values of W and En. 

This arrangement can be used to specify a large n-to-2” decoder simply by increasing 
the sizes of vectors W and Y accordingly, and making n — | be the terminal index value 
of k. 


Example 4.19 


The priority encoder of Figure 4.20 can be defined by the Verilog code in Figure 4.38. In 
the always block, the output bits y; and yo are first set to the don’t-care state and z is cleared 
to 0. Then, if one or more of the four inputs w3, ..., wo is equal to 1, the for loop will set 
the valuation of y;y9 to match the index of the highest priority input that has the value 1. 
Note that each successive iteration through the loop corresponds to a higher priority. Verilog 
semantics specify that a signal that receives multiple assignments in an always block retains 
the last assignment. Thus the iteration that corresponds to the highest priority input that is 
equal to 1 will override any setting of Y established during the previous iterations. 


module dec2to4 (W, En, Y); 
input [1:0] W; 
input En; 
output reg [0:3] Y; 
integer k; 


always @(W, En) 
for (k =0;k<=3;k=k+1) 
if ((W == k) && (En == 1)) 


Y[k] = 1; 
else 
Y[k] =0; 
endmodule 


Figure 4.37 — A 2-to-4 binary decoder specified using the for 
loop. 
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module priority (W, Y, z); 
input [3:0] W; 
output reg [1:0] Y; 
output reg z; 
integer k; 


always @(W) 
begin 
Y = 2’bx; 
z=0; 
for (k=0;k < 4;k =k+1) 
if (W[k]) 
begin 
Y=k; 
z=1; 
end 
end 


endmodule 


Figure 4.38 — A priority encoder specified using the for 
loop. 


4.6.5 VERILOG OPERATORS 


In this section we discuss the Verilog operators that are useful for synthesizing logic circuits. 
Table 4.2 lists these operators in groups that reflect the type of operation performed. A more 
complete listing of the operators is given in Table A.1. 

To illustrate the results produced by the various operators, we will use three-bit vectors 
A[2:0], B[2:0], and C[2:0], as well as scalars f and w. 


Bitwise Operators 
Bitwise operators operate on individual bits of operands. The ~ operator forms the 1’s 
complement of the operand such that the statement 


C=~A; 


produces the result cz = a2, cj = a, and cp = Go, where q; and c; are the bits of the vectors 
Aand C. 
Most bitwise operators operate on pairs of bits. The statement 


C=A S&B; 


generates c2 = a) - bo, cy = a, - by, and cp = ag - bo. Similarly, the | and “ operators per- 
form bitwise OR and XOR operations. The “~ operator, which can also be written as ~”, 
produces the XNOR such that 
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Table 4.2 Verilog operators. 


Operator type Operator symbols Operation performed Number of operands 

Bitwise ~ 1’s complement 1 
& Bitwise AND 2 
| Bitwise OR 2) 
A Bitwise XOR 2 
NON Bitwise XNOR 2 
Logical ! NOT 1 
&& AND 2 
\| OR 2 
Reduction & Reduction AND 1 
~& Reduction NAND 1 
| Reduction OR 1 
|| Reduction NOR 1 
A Reduction XOR 1 
IN ye NS Reduction XNOR 1 
Arithmetic + Addition ) 
- Subtraction 2 
- 2’s complement 1 
* Multiplication 2 
/ Division 2 
Relational = Greater than 2; 
< Less than 2) 
— Greater than or equal to 2 
= Less than or equal to % 
Equality == Logical equality 2 
i Logical inequality 2 
Shift SS Right shift 2 
<< Left shift 2 

Concatenation tte} Concatenation Any number 

Replication {{}} Replication Any number 
Conditional He Conditional 3 

C=A~‘B; 


gives Co = ao @ bo, c) = ay @ Dj, and co = ao © bo. If the operands are of unequal size, 
then the shorter operand is extended by padding 0s to the left. 

A scalar function may be assigned a value as a result of a bitwise operation on two 
vector operands. In this case, it is only the least-significant bits of the operands that are 
involved in the operation. Hence the statement 


f=A%’B; 


yields f = ao @ Do. 
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&{|0O 1 x | |O 1 x 
0|;0 0 O 0o;0O 1 x 
1;0 1 x 1/1 1 1 
x |O x x x{/x 1 x 
S'O 1 x ~“'0 1 x 
0;0 1 x 0 1 OO x 
1|/1 0 x 1 Oo tl x 
x |x x x x x xX xX 


Figure 4.39 Truth tables for bitwise operators. 


The bitwise operations may involve operands that include the unknown logic value x. 
Then the operations are performed according to the truth tables in Figure 4.39. For example, 
if P= 4’b101x and Q = 4’b1001, then P & Q = 4’b100x while P| Q=4’b1011. 


Logical Operators 
The ! operator has the same effect on a scalar operand as the ~ operator. Thus, f = !w 
= ~w. But the effect on a vector operand is different, namely if 


f=!A; 
then f will be equal to | (true) only if all bits of A are equal to 0 (false). Hence, f = 


an +a, + ao. 
The && operator implements the AND operation such that 


f=A&& B; 
produces f = (a2 + a; + do) - (bo + db} + bo). Similarly, using the || operator in 
f=A]||B; 
gives f = (a + a; + ao) + (bz +b; + bo). 


Reduction Operators 
The reduction operators perform an operation on the bits of a single vector operand 
and produce a one-bit result. Using the & operator in 


f= &A; 
produces f = a2 - a; - ao. Similarly, 
f=%*A; 


gives f = ad) ® a; ® do, and so on. 
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Arithmetic Operators 


We have already encountered the arithmetic operators in Chapter 3. They perform 
standard arithmetic operations. Thus 


C=A+B; 
puts the three-bit sum of A plus B into C, while 
C=A-—B; 
puts the difference of A and B into C. The operation 
C=-A; 


places the 2’s complement of A into C. 

The addition, subtraction, and multiplication operations are supported by most CAD 
synthesis tools. However, the division operation is often not supported. When the Verilog 
compiler encounters an arithmetic operator, it usually synthesizes it by using an appropriate 
module from a library. 


Relational Operators 

The relational operators are typically used as conditions in if-else and for statements. 
These operators have the same meaning as the corresponding operators in the C program- 
ming language. An expression that uses the relational operators returns the value | if it is 
evaluated as true, and the value 0 if evaluated as false. If there are any x (unknown) or z 
bits in the operands, then the expression takes the value x. 


Example 4.20 


The use of relational operators in the if-else statement is illustrated in Figure 4.40. The 
defined circuit is the four-bit comparator described in Section 4.5. 


Equality Operators 

The expression (A == B) is evaluated as true if A is equal to B and false otherwise. The 
!= operator has the opposite effect. The result is ambiguous (x) if either operand contains 
x or z values. 


Shift Operators 


A vector operand can be shifted to the right or left by a number of bits specified as a 
constant. When bits are shifted, the vacant bit positions are filled with Os. For example, 


B=A<<\l1; 
results in by = a1, bj = ao, and bo = 0. Similarly, 
B=A>>2; 


yields by = b; = 0 and bo = a. 
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module compare (A, B, AeqB, AgtB, AItB); 
input [3:0] A, B; 
output reg AeqB, AgtB, AItB; 


always @(A, B) 
begin 
AegB = 0; 
AgtB = 0; 
AItB = 0; 
if (A == B) 
AeqB = 1; 
else if (A > B) 
AgtB = 1; 
else 
AltB = 1; 
end 


endmodule 


Figure 4.40 Verilog code for a four-bit comparator. 


Concatenate Operator 
This operator concatenates two or more vectors to create a larger vector. For example, 


D = {A, B}; 
defines the six-bit vector D = a2a,agb2b, bo. Similarly, the concatenation 
E= {3’b111, A, 2’b00}; 


produces the eight-bit vector EF = 111la2a,aq00. 


Replication Operator 

This operator allows repetitive concatenation of the same vector, which is replicated the 
number of times indicated in the replication constant. For example, {3{A}} is equivalent to 
writing {A, A, A}. The specification {4{2’b10}} produces the eight-bit vector 10101010. 

The replication operator may be used in conjunction with the concatenate operator. For 
instance, {2{A}, 3{B}} is equivalent to {A, A, B, B, B}. We introduced the concatenate 
and replication operators in Sections 3.5.6 and 3.5.8, respectively, and illustrated their use 
in specifying the adder circuits. 
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Table 4.3 Precedence of Verilog operators. 
Operator type Operator symbols Precedence 
Complement Re Highest precedence 
Arithmetic x | 

+ — 
Shift << >> 
Relational QB SS S= 
Equality == |= 
Reduction & ~& 

A ou” 

ical 
Logical &E& 

ll 

Conditional ve Lowest precedence 


Conditional Operator 
The conditional operator is discussed fully in Section 4.6.1. 


Operator Precedence 

The Verilog operators are assumed to have the precedence indicated in Table 4.3. 
The order of precedence is from top to bottom; operators in the top row have the highest 
precedence and those in the bottom row have the lowest precedence. The operators listed 
in the same row have the same precedence. 

The designer can use parentheses to change the precedence of operators in Verilog code 
or remove any possible misinterpretation. It is a good practice to use parentheses to make 
the code unambiguous and easy to read. 


4.6.6 THE GENERATE CONSTRUCT 


In Section 3.5.4 we introduced the generate loop capability which can be used to create 
multiple instances of subcircuits. A subcircuit may be defined in a block of statements 
delineated by the generate and endgenerate keywords. The subcircuit is instantiated 
multiple times using a generate-index variable. This variable is defined using the genvar 
keyword and it can have only positive integer values. It is not possible to use an index 
declared as a normal integer variable. 


Example 4.21 


Figure 4.41 shows how the generate construct can be used to specify an n-bit ripple-carry 
adder. The subcircuit is a full-adder defined structurally in terms of primitive gates as 
introduced in Figure 3.18. The for loop causes the full-adder block to be instantiated n 
times. 
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module addern (carryin, X, Y, S, carryout); 
parameter n= 32; 
input carryin; 
input [n—1:0] X, Y; 
output [n—1:0] S; 
output carryout; 
wire [n:0] C; 


genvar k; 
assign C[0] = carryin; 
assign carryout = C[n]; 
generate 
for (k =0;k < nj; k=k+1) 
begin: fulladd_stage 
wire zl, z2, z3; //wires within full-adder 
xor (S[k], X[k], Y[k], C[k]); 
and (zl, X[k], Y[{k]); 
and (z2, X[k], C[k]); 
and (z3, Y[k], C[k]); 
or (C[k+1], zl, z2, z3); 
end 
endgenerate 


endmodule 


Figure 4.41 —_ Using the generate loop to define an n-bit 
ripple-carry adder. 


In this example, the for statement is used in the generate block to control the selection 
of the generated objects. The generate block can also contain if-else and case statements 
to determine which objects are generated. 
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4.6.7 TASKS AND FUNCTIONS 


In high-level programming languages it is possible to use subroutines and functions to 
avoid replicating specific routines that may be needed in several places of a given program. 
Verilog provides similar capabilities, known as tasks and functions. They can be used to 
modularize large designs and make the Verilog code easier to understand. 


Verilog Task 


A task is declared by the keyword task and it comprises a block of statements that ends 
with the keyword endtask. The task must be included in the module that calls it. It may 
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have input and output ports. These are not the ports of the module that contains the task, 
which are used to make external connections to the module. The task ports are used only 
to pass values between the module and the task. 


Example 4.22 


In Figure 4.29 we showed the Verilog code for a 16-to-1 multiplexer that instantiates five 
copies of a 4-to-1 multiplexer circuit given in a separate module named mux4tol. The same 
circuit can be specified using the task approach as shown in Figure 4.42. Observe the key 
differences. The task mux4to/ is included in the module mux/6to/. It is called from an 
always block by means of an appropriate case statement. The output of a task must be a 
variable, hence g is of reg type. 


module mux16tol (W, S16, f); 
input [0:15] W; 
input [3:0] S16; 
output reg f; 


always @(W, S16) 
case (S16[3:2]) 
0: mux4tol (W 
1: mux4tol (W 
2: mux4tol (W 
3: mux4tol (W 
endcase 


0:3], S16[1:0], £); 
4:7], S16[1:0], f); 
8:11], S16[1:0], f); 
12:15], S16[1:0], f); 


(ol reel ren aoe 


// Task that specifies a 4-to-1 multiplexer 
task mux4tol; 

input [0:3] X; 

input [1:0] S4; 

output reg g; 


case (S4) 
0: g = X[0]; 
1: g=X(1]; 
2: g=X[2]; 
3: g = X[3]; 
endcase 
endtask 
endmodule 


Figure 4.42 — Use of a task in Verilog code. 
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Verilog Function 

A function is declared by the keyword function and it comprises a block of statements 
that ends with the keyword endfunction. The function must have at least one input and it 
returns a single value that is placed where the function is invoked. 


Figure 4.43 shows how the code in Figure 4.42 can be written to use a function. The Verilog Example 4.23 
compiler essentially inserts the body of the function at each place where it is called. Hence 
the clause 


0: f = mux4tol (W[0:3], S16[1:0]); 


becomes 


module mux16tol (W, S16, f); 
input [0:15] W; 
input [3:0] S16; 
output reg f; 


// Function that specifies a 4-to-1 multiplexer 
function mux4tol; 

input [0:3] X; 

input [1:0] S4; 


case (S4) 
0: mux4tol = X[0]; 
1: mux4tol = X[1]; 
2: mux4tol = X[2]; 
3: mux4tol = X[3]; 
endcase 
endfunction 


always @(W, S16) 
case (S16[3:2]) 
0: f= mux4tol (W[0:3], S16[1:0]); 
1: f = mux4tol (W[4:7], S16[1:0]); 
2: f = mux4tol (W[8:11], S16[1:0]); 
3: f = mux4tol (W[12:15], S16[1:0]); 
endcase 


endmodule 


Figure 4.43 — The code from Figure 4.42 using a function. 
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0: case (S16[1:0]) 


0: f = W[O]; 
1: f=W[1]; 
2: f= W[2]; 
3: f= W[3]; 
endcase 


The function serves as a convenience that makes the mux/6to] module more compact. 


A Verilog function can invoke another function but it cannot call a Verilog task. A task 
may call another task and it may invoke a function. In Figure 4.42 we defined the task after 
the always block that calls it. In contrast, in Figure 4.43 we defined the function before 
the always block that invokes it. Both possibilities are allowed in the Verilog standard for 
both tasks and functions. However, some tools require functions to be defined before the 
statements that invoke them. 


4.7 CONCLUDING REMARKS 


This chapter has introduced a number of circuit building blocks. Examples using these 
blocks to construct larger circuits will be presented in later chapters. To describe the 
building block circuits efficiently, several Verilog constructs have been introduced. In 
many cases a given circuit can be described in various ways, using different constructs. A 
circuit that can be described using an if-else statement can also be described using a case 
statement or perhaps a for loop. In general, there are no strict rules that dictate when one 
style should be preferred over another. With experience the user develops a sense for which 
types of statements work well in a particular design situation. Personal preference also 
influences how the code is written. 

Verilog is not a programming language, and Verilog code should not be written as if it 
were a computer program. The statements discussed in this chapter can be used to create 
large, complex circuits. A good way to design such circuits is to construct them using well- 
defined modules, in the manner that we illustrated for the multiplexers, decoders, encoders, 
and so on. Additional examples using the Verilog statements introduced in this chapter are 
given in Chapters 5 and 6. In Chapter 7 we provide a number of examples of using Verilog 
code to describe larger digital systems. For more information on Verilog, the reader can 
consult more specialized books [2-8]. 

In the next chapter we introduce logic circuits that include the ability to store logic 
signal values in memory elements. 


4.8 EXAMPLES OF SOLVED PROBLEMS 
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This section presents some typical problems that the reader may encounter, and shows how 
such problems can be solved. 


Problem: Implement the function f (w1, w2, w3) = >> m(0, 1, 3, 4, 6, 7) by using a 3-to-8 
binary decoder and an OR gate. 


Solution: The decoder generates a separate output for each minterm of the required function. 
These outputs are then combined in the OR gate, giving the circuit in Figure 4.44. 


Example 4.24 


Problem: Derive a circuit that implements an 8-to-3 binary encoder. 


Solution: The truth table for the encoder is shown in Figure 4.45. Only those rows for 
which a single input variable is equal to 1 are shown; the other rows can be treated as don’t 
care cases. From the truth table it is seen that the desired circuit is defined by the equations 

yo = Wa + ws + Wo + W7 

Yi = W2 + W3 + Wo + W7 

Yo = Wi + W3 + Ws + W7 


Example 4.25 


Problem: Implement the function 
SF (W1, W2, W3, Wa, W5) = W1W2W4Ws + WW + Wiw3 + WiW4 + W3W4Ws 


by using a 4-to-1 multiplexer and as few other gates as possible. Assume that only the 
uncomplemented inputs w,, w2, w3, w4, and ws are available. 


Figure 4.44 = Circuit for Example 4.24. 


Example 4.26 
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Figure 4.45 Truth table for an 8-to-3 binary encoder. 


Figure 4.46 = Circuit for Example 4.26. 


Solution: Since variables w; and w4 appear in more product terms in the expression for 
f than the other three variables, let us perform Shannon’s expansion with respect to these 
two variables. The expansion gives 


7 = W1 Walt, wy, + W1Waftw, + wi Wafww, + Wi Wahi, ws 
= W1W4(W2Ws) + Wiwa(w3ws) + W1W4(W2 + W3) + WiWa4(1) 


We can use a NOR gate to implement w.ws = w2 + ws. We also need an AND gate and an 
OR gate. The complete circuit is presented in Figure 4.46. 
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by by bo &2 8; 8&8 


0 0 O 0 0 O 
0 0 1 0 0 1 
0 1 O 0 1 1 
0 1 1 0 1 O 
1 0 O 1 1 O 
1 0 1 1 1 1 
1 1 O 1 0 1 
1 1 1 1 0 0 


Figure 4.47 _ Binary to Gray code coversion. 


Problem: In Chapter 2 we pointed out that the rows and columns of a Karnaugh map Example 4.27 
are labeled using Gray code. This is a code in which consecutive valuations differ in one 

variable only. Figure 4.47 depicts the conversion between three-bit binary and Gray codes. 

Design a circuit that can convert a binary code into Gray code according to the figure. 


Solution: From the figure it follows that 
82 = by 
1 = bby + bby 
=b @bhy 
80 = bob; + bob 
=h eh 


Problem: In Section 4.1.2 we showed that any logic function can be decomposed using Example 4.28 
Shannon’s expansion theorem. For a four-variable function, f (w,, ..., w4), the expansion 
with respect to w, is 


fi, ..., wa) = Wifs, + wif, 


A circuit that implements this expression is given in Figure 4.48a. 

(a) If the decomposition yields fj, = 0, then the multiplexer in the figure can be replaced 
by a single logic gate. Show this circuit. 

(b) Repeat part (a) for the case where f,,, = 1. 


Solution: The desired circuits are shown in parts (b) and (c) of Figure 4.48. 
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(c) Solution for part b. 


Figure 4.48 — Circuits for Example 4.28. 


Example 4.29 Problem: In Section 2.17 we said that field-programmable gate arrays (FPGAs) contain 
lookup tables (LUTs) that are used to implement logic functions. Each LUT can be pro- 
grammed to implement any logic function of its inputs. FPGAs are discussed in detail in 
Appendix B. Many commercial FPGAs contain four-input lookup tables (4-LUTs). What 
is the minimum number of 4-LUTs needed to construct a 4-to-1 multiplexer with select 
inputs s; and so and data inputs w3, w2, w1, and wo? 


Solution: A straightforward attempt is to use directly the expression that defines the 4-to-1 
multiplexer 


f = S1Sqwo + 51S0W1 + S1S0W2 + 51S0W3 


Let g = S1Sqwo + S1SoW1 and h = s)Sow2 + S1SqW3, So that f = g +h. This decomposition 
leads to the circuit in Figure 4.49a, which requires three LUTs. 
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(a) Using three LUTs 


(b) Using two LUTs 


Figure 4.49 — Circuits for Example 4.29. 


When designing logic circuits, one can sometimes come up with a clever idea which 
leads to a superior implementation. Figure 4.49b shows how it is possible to implement 
the multiplexer with just two LUTs, based on the following observation. The truth table in 
Figure 4.2b indicates that when s; = 0 the output must be either wo or w;, as determined 
by the value of so. This can be generated by the first LUT, with the output k. The second 
LUT must make the choice between w2 and w3 when s; = 1. But, the choice can be made 
only by knowing the value of so. Since it is impossible to have five inputs in the LUT, more 
information has to be passed from the first to the second LUT. Observe that when s; = | 
the output f will be equal to either w2 or w3, in which case it is not necessary to know the 
values of wo and w;. Hence, in this case we can pass on the value of sg through the first 
LUT, rather than wo or w;. This can be done by making the function of this LUT 


k= S| (SqWo + SoW}) + AY IPY0) 
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Figure 4.50 A shifter circuit. 


Then, the second LUT performs the function 
f =5ik + si(kw3 + kwa) 


Example 4.30 Problem: In digital systems it is often necessary to have circuits that can shift the bits of 
a vector by one or more bit positions to the left or right. Design a circuit that can shift a 
four-bit vector W = w3w2w Wo one bit position to the right when a control signal Shift is 
equal to 1. Let the outputs of the circuit be a four-bit vector Y = y3y2y, yo and a signal k, 
such that if Shift = 1 then y3 = 0, yo = w3, yj = W2, Yo = Wi, and k = wo. If Shift = 0 
then Y = W andk = 0. 


Solution: The required circuit can be implemented with five 2-to-1 multiplexers as shown 
in Figure 4.50. The Shift signal is used as the select input to each multiplexer. 


Example 4.31 Problem: The shifter circuit in Example 4.30 shifts the bits of an input vector by one bit 
position to the right. It fills the vacated bit on the left side with 0. A more versatile shifter 
circuit may be able to shift by more bit positions at a time. If the bits that are shifted out are 
placed into the vacated positions on the left, then the circuit effectively rotates the bits of 
the input vector by a specified number of bit positions. Such a circuit is often called a barrel 
shifter. Design a four-bit barrel shifter that rotates the bits by 0, 1, 2, or 3 bit positions as 
determined by the valuation of two control signals s; and so. 


Solution: The required action is given in Figure 4.5la. The barrel shifter can be imple- 
mented with four 4-to-1 multiplexers as shown in Figure 4.51b. The control signals s; and 
Sg are used as the select inputs to the multiplexers. 


Example 4.32 Problem: Write Verilog code that represents the circuit in Figure 4.17. Use the dec2to4 
module in Figure 4.31 as a subcircuit in your code. 


Solution: The code is shown in Figure 4.52. Note that the dec2to4 module can be included 
in the same file as we have done in the figure, but it can also be in a separate file in the 
project directory. 
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(a) Truth table 
(b) Circuit 
Figure 4.51 — A barrel shifter circuit. 
Problem: Write Verilog code that represents the shifter circuit in Figure 4.50. Example 4.33 
Solution: One possibility is to specify the structure of this circuit as shown in Figure 4.53. 
The if-else construct is used to define the desired shifting of individual bits. A typical 
Verilog compiler will implement this code with 2-to-1 multiplexers as depicted in Fig- 
ure 4.50. 
An alternative is to make use of the shift operator defined in Section 4.6.5, as indicated 
in Figure 4.54. 
Problem: Write Verilog code that defines the barrel shifter in Figure 4.51. Example 4.34 


Solution: The code in Figure 4.55 is a possible solution. The rotate function is accomplished 
by concatenating two copies of the input vector W and shifting the obtained 8-bit vector to 
the right by the number of bit positions specified as the input S. The four least-significant 


bits of the resulting 8-bit vector are the desired output Y. 
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module mux4tol (W, S, f); 
input [0:3] W; 
input [1:0] S; 
output f; 
wire [0:3] Y; 


dec2to4 decoder (S, 1, Y); 
assign f=|(W & Y); 


endmodule 


module dec2to4 (W, En, Y); 
input [1:0] W; 
input En; 
output reg [0:3] Y; 


always @(W, En) 
case ({En, W}) 
3’b100: Y = 4’b1000; 
3’b101: Y = 4’b0100; 
3’b110: Y = 4’b0010; 
3’bl11: Y = 4’b0001; 
default: Y = 4’b0000; 
endcase 


endmodule 


Figure 4.52 _ Verilog code for Example 4.32. 


Example 4.35 The concept of parity is widely used in digital systems for error-checking purposes. When 
digital information is transmitted from one point to another, perhaps by long wires, it is 
possible for some bits to become corrupted during the transmission process. For example, 
the sender may transmit a bit whose value is equal to 1, but the receiver observes a bit whose 
value is 0. Suppose that a data item consists of n bits. A simple error-checking mechanism 
can be implemented by including an extra bit, p, which indicates the parity of the n-bit item. 
Two kinds of parity can be used. For even parity the p bit is given the value such that the 
total number of Is in the 7 + 1 transmitted bits (comprising the -bit data and the parity bit 
p) is even. For odd parity the p bit is given the value that makes the total number of 1s odd. 
The sender generates the p bit based on the n-bit data item that is to be transmitted. The 
receiver checks whether the parity of the received item is correct. 

Parity generating and checking circuits can be realized with XOR gates. For example, 
for a four-bit data item consisting of bits x3.x2x;xo, the even parity bit can be generated as 


P=X3 Bx BX Bo 
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module shifter (W, Shift, Y, k); 
input [3:0] W; 
input Shift; 
output reg [3:0] Y; 
output reg k; 


always @(W, Shift) 
begin 
if (Shift) 
begin 
Y[3] =0; 
Y[2:0] = W[3:1]; 
k= W/[(0]; 
end 
else 
begin 
Y=W; 
k=0; 
end 
end 


endmodule 


Figure 4.53 Verilog code for the circuit in Figure 4.50. 


At the receiving end the checking is done using 
C=P Ox3 OB X2 OX OX 


Ifc = 0, then the received item shows the correct parity. Ifc = 1, then an error has occurred. 
Note that observing c = 0 is not a guarantee that the received item is correct. If two or any 
even number of bits have their values inverted during the transmission, the parity of the 
data item will not be changed; hence the error will not be detected. But if an odd number 
of bits are corrupted, then the error will be detected. 


Problem: The ASCII code, discussed in Section 1.5.3, uses seven-bit patterns to represent 
characters. In computer applications it is common to use one byte per character. The eighth 
bit, b7, is usually set to 0 for use in digital processing. But, if the the character data is to be 
transmitted from one digital system to another, it may be prudent to use bit b7 as a parity 
bit. Write Verilog code that specifies a circuit that accepts an input byte (where b7 = 0) and 
produces an output byte where b7 is the even parity bit. 


Solution: Let X and Y be the input and output bytes, respectively. Then, the desired 
solution is given in Figure 4.56. 
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module shifter (W, Shift, Y, k); 
input [3:0] W; 
input Shift; 
output reg [3:0] Y; 
output reg k; 


always @(W, Shift) 


begin 
if (Shift) 
begin 
Y=W>> 1; 
k= WI(0]; 
end 
else 
begin 
Y=W; 
k=0; 
end 
end 
endmodule 


Figure 4.54 = Alternative Verilog code for the circuit in 
Figure 4.50. 


module barrel (W, S, Y); 
input [3:0] W; 
input [1:0] S; 
output [3:0] Y; 
wire [3:0] T; 
assign {T, Y} ={W, W} >> S; 


endmodule 


Figure 4.55 —_ Verilog code for the barrel shifter. 


module parity (X, Y); 
input — [7:0] X; 
output [7:0] Y; 
assign Y = {X[6:0], X[6:0]}; 


endmodule 


Figure 4.56 —_ Verilog code for Example 4.35. 
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Answers to problems marked by an asterisk are given at the back of the book. 


4.1 


4.2 


*4.3 


4.4 
*4.5 


4.6 
4.7 


4.8 
4.9 
*4.10 


*4.11 


4.12 


*4.13 


4.16 


Show how the function f (w1, w2, w3) = >> m(0, 2, 3, 4, 5, 7) can be implemented using a 
3-to-8 binary decoder and an OR gate. 


Show how the function f (wi, w2, w3) = >> m(1, 2, 3,5, 6) can be implemented using a 
3-to-8 binary decoder and an OR gate. 


Consider the function f = w,w3 + w2w3 + W1W2. Use the truth table to derive a circuit for 
f that uses a 2-to-1 multiplexer. 


Repeat Problem 4.3 for the function f = W2w3 + w1W2. 


For the function f(w1, w2, w3) = >> m(0, 2, 3, 6), use Shannon’s expansion to derive an 
implementation using a 2-to-1 multiplexer and any other necessary gates. 


Repeat Problem 4.5 for the function f (w1, w2, w3) = >> m(0, 4, 6, 7). 


Consider the function f = W2 + W1W3 + w1;w3. Show how repeated application of Shan- 
non’s expansion can be used to derive the minterms of /. 


Repeat Problem 4.7 for f = w2 + 13. 
Prove Shannon’s expansion theorem presented in Section 4.1.2. 


Section 4.1.2 shows Shannon’s expansion in sum-of-products form. Using the principle of 
duality, derive the equivalent expression in product-of-sums form. 


Consider the function f = W,)W2 + W2W3 + w1W2W3. The cost of this minimal sum-of- 
products expression is 14, which includes four gates and 10 inputs to the gates. Use 
Shannon’s expansion to derive a multilevel circuit that has a lower cost and give the cost 
of your circuit. 


Use multiplexers to implement the circuit for stage 0 of the carry-lookahead adder in Figure 
3.15 (included in the right-most shaded area). 


Derive minimal sum-of-products expressions for the outputs a, b, and c of the 7-segment 
display in Figure 4.21. 


Derive minimal sum-of-products expressions for the outputs d, e, f, and g of the 7-segment 
display in Figure 4.21. 


For the function, f, in Example 4.26 perform Shannon’s expansion with respect to variables 
w, and wy, rather than w; and w4. How does the resulting circuit compare with the circuit 
in Figure 4.46? 


Consider the multiplexer-based circuit illustrated in Figure P4.1. Show how the function 
Ff = W2W3 + w,w3 + W2wW3 can be implemented using only one instance of this circuit. 
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Figure P4.1 = A multiplexer-based circuit. 


Show how the function f = wi w3 + W1wW3 + w2W3 + W1W2 can be realized using one or 
more instances of the circuit in Figure P4.1. Note that there are no NOT gates in the circuit; 
hence complements of signals have to be generated using the multiplexers in the logic block. 


Consider the Verilog code in Figure P4.2. What type of circuit does the code represent? 
Comment on whether or not the style of code used is a good choice for the circuit that it 
represents. 


module problem4_18 (W, En, yO, yl, y2, y3); 
input [1:0] W; 
input En; 
output reg yO, yl, y2, y3; 


always @(W, En) 


begin 
yO = 0; 
yl =0; 
y2 =0; 
y3 =0; 
if (En) 


if W==0) yO=1; 
else if (W==1) yl=1; 
else if (W == 2) y2=1; 
else y3=1; 

end 


endmodule 


Figure P4.2 = Code for Problem 4.18. 
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Write Verilog code that represents the function in Problem 4.2, using a case statement. 
Write Verilog code for a 4-to-2 binary encoder. 
Write Verilog code for an 8-to-3 binary encoder. 


Figure P4.3 shows a modified version of the code for a 2-to-4 decoder in Figure 4.37. This 
code is almost correct but contains one error. What is the error? 


module dec2to4 (W, En, Y); 
input [1:0] W; 
input En; 
output reg [0:3] Y; 
integer k; 


always @(W, En) 
for (k = 0;k <=3;k =k+1) 
if (W == k) 
Y(k] = En; 


endmodule 


Figure P4.3 = Code for Problem 4.22. 


Derive the circuit for an 8-to-3 priority encoder. 
Using a casex statement, write Verilog code for an 8-to-3 priority encoder. 
Repeat Problem 4.24, using a for loop. 


Create a Verilog module named if2to4 that represents a 2-to-4 binary decoder using an 
if-else statement. Create a second module named /3to8 that represents the 3-to-8 binary 
decoder in Figure 4.15 using two instances of the if2to4 module. 


Create a Verilog module named h6to64 that represents a 6-to-64 binary decoder. Use the 
treelike structure in Figure 4.16, in which the 6-to-64 decoder is built using nine instances 
of the h3to& decoder created in Problem 4.26. 


Write Verilog code that represents the circuit in Figure 4.17. Use the dec2to4 module in 
Figure 4.31 as a subcircuit in your code. 


Design a shifter circuit, similar to the one in Figure 4.50, which can shift a four-bit input 
vector, W = w3W2W Wo, one bit-position to the right when the control signal Right is equal 
to 1, and one bit-position to the left when the control signal Left is equal to 1. When Right 
= Left = 0, the output of the circuit should be the same as the input vector. Assume that 
the condition Right = Left = 1 will never occur. 


Design a circuit that can multiply an eight-bit number, A = a7,..., do, by 1, 2, 3 or 4 to 
produce the result A, 2A, 3A or 4A, respectively. 
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Write Verilog code that implements the task in Problem 4.30. 


Figure 4.47 depicts the relationship between the binary and Gray codes. Design a circuit 
that can convert Gray code into binary code. 


Example 4.35 and Figure 4.56 show how a circuit that generates an ASCII byte suitable for 
sending over a communications link may be defined. Write Verilog code for its counterpart 
at the receiving end, where byte Y (which includes the parity bit) has to be converted into 
byte X in which the bit x7 has to be 0. An error signal has to be produced, which is set to 
0 or 1 depending on whether the parity check indicates correct or erroneous transmission, 


respectively. 
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FLIP-FLOPS, REGISTERS, AND COUNTERS 


CHAPTER OBJECTIVES 


In this chapter you will learn about: 


e Logic circuits that can store information 

e Flip-flops, which store a single bit 

e Registers, which store multiple bits 

e Shift registers, which shift the contents of the register 
e Counters of various types 

e Verilog constructs used to implement storage elements 
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In previous chapters we considered combinational circuits where the value of each output depends solely on 
the values of signals applied to the inputs. There exists another class of logic circuits in which the values of the 
outputs depend not only on the present values of the inputs but also on the past behavior of the circuit. Such 
circuits include storage elements that store the values of logic signals. The contents of the storage elements 
are said to represent the state of the circuit. When the circuit’s inputs change values, the new input values 
either leave the circuit in the same state or cause it to change into a new state. Over time the circuit changes 
through a sequence of states as a result of changes in the inputs. Circuits that behave in this way are referred 
to as sequential circuits. 


In this chapter we will introduce circuits that can be used as storage elements. But first, we 
will motivate the need for such circuits by means of a simple example. Suppose that we wish 
to control an alarm system, as shown in Figure 5.1. The alarm mechanism responds to the 
control input On/Off. Itis turned on when On/Off = 1, and itis off when On/Off = 0. The 
desired operation is that the alarm turns on when the sensor generates a positive voltage 
signal, Set, in response to some undesirable event. Once the alarm is triggered, it must 
remain active even if the sensor output goes back to zero. The alarm is turned off manually 
by means of a Reset input. The circuit requires a memory element to remember that the 
alarm has to be active until the Reset signal arrives. 

Figure 5.2 gives a rudimentary memory element, consisting of a loop that has two 
inverters. If we assume that A = 0, then B = 1. The circuit will maintain these values 
indefinitely because of the feedback loop. We say that the circuit is in the state defined 
by these values. If we assume that A = 1, then B = 0, and the circuit will remain in this 
second state indefinitely. Thus the circuit has two possible states. This circuit is not useful, 
because it lacks some practical means for changing its state. Useful circuits that exhibit 
such memory behavior can be constructed with logic gates. 


Set 
Memory On/Off 


Alarm 


element 


Reset ———————"> 


Figure 5.1) == Control of an alarm system. 


Figure 5.2 — A simple memory element. 


5.1 Basic LATCH 
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Figure 5.3 presents a memory element built with NOR gates. Its inputs, Set and Reset, 
provide the means for changing the state, Q, of the circuit. A more usual way of drawing 
this circuit is given in Figure 5.4a, where the two NOR gates are said to be connected in 
cross-coupled style. The circuit is referred to as a basic latch. Its behavior is described by 
the table in Figure 5.4b. When both inputs, R and S, are equal to 0 the latch maintains its 
existing state because of the feedback loop. This state may be either Q, = 0 and Q, = I, 
or Q, = | and Q, = 0, which is indicated in the table by stating that the Q, and Q, outputs 
have values 0/1 and 1/0, respectively. Observe that Q, and Q, are complements of each 
other in this case. When R = | and S = 0, the latch is reset into a state where Q, = 0 and 
Q, = 1. When R = Oand S = 1, the latch is set into a state where Q, = 1 and Q, = 0. The 
fourth possibility is to have R = S = 1. In this case both Q, and Q, will be 0. The table in 
Figure 5.4b resembles a truth table. However, since it does not represent a combinational 
circuit in which the values of the outputs are determined solely by the current values of the 
inputs, it is often called a characteristic table rather than a truth table. 

Figure 5.4c gives a timing diagram for the latch, assuming that the propagation delay 
through the NOR gates is negligible. Of course, in areal circuit the changes in the waveforms 
would be delayed according to the propagation delays of the gates. We assume that initially 
Q,, = Oand Q, = 1. The state of the latch remains unchanged until time f, when S becomes 
equal to 1, causing Q, to change to 0, which in turn causes Q, to change to | because 
R+ Q, = 1. The causality relationship is indicated by the arrows in the diagram. When S 
goes to 0 at #3, there is no change in the state because both S and R are then equal to 0. At 
t4 we have R = 1, which causes Q,, to go to 0, which in turn causes Q, to go to 1 because 
S+Q,= 1. Att; both S and R are equal to 1, which forces both Q,, and Q, to be equal 
to 0. As soon as S returns to 0, at f¢, Q,, becomes equal to | again. At tg we have S = 1 
and R = 0, which causes Q, = 0 and Q, = I. An interesting situation occurs at fj9. From 
ty to tig we have Q, = Q, = 0 because R = S = 1. Now if both R and S change to 0 at 
tio, both Q, and Q, will go to 1. But having both Q, and Q, equal to 1 will immediately 
force Q, = Q, = 0. There will be an oscillation between Q, = Q, = 0 and Q, = Q, = I. 
If the delays through the two NOR gates are exactly the same, the oscillation will continue 
indefinitely. In a real circuit there will invariably be some difference in the delays through 
these gates, and the latch will eventually settle into one of its two stable states, but we don’t 
know which state it will be. This uncertainty is indicated in the waveforms by dashed lines. 


Reset 


Figure 5.3. A memory element with NOR gates. 
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R 
Qa 
0 0 0/1 1/0 (no change) 
0 1 0 1 
1 O 1 O 
Q 1 1 0 O 
5 b 
(a) Circuit (b) Characteristic table 
1 
R 
0 
1 
S 
0 
1 
Qu 
0 
1 
Qs 
0 


(c) Timing diagram 


Figure 5.4 — A basic latch built with NOR gates. 


The oscillations discussed above illustrate that even though the basic latch is a simple 
circuit, careful analysis has to be done to fully appreciate its behavior. In general, any 
circuit that contains one or more feedback paths, such that the state of the circuit depends 
on the propagation delays through logic gates, has to be designed carefully. We discuss 
timing issues in Section 5.15 and in Chapter 9. 

The latch in Figure 5.4a can perform the functions needed for the memory element in 
Figure 5.1, by connecting the Set signal to the S input and Reset to the R input. The Q, 
output provides the desired On/Off signal. To initialize the operation of the alarm system, 
the latch is reset. Thus the alarm is off. When the sensor generates the logic value 1, the 
latch is set and Q, becomes equal to 1. This turns on the alarm mechanism. If the sensor 
output returns to 0, the latch retains its state where Q, = 1; hence the alarm remains turned 


5.2. GATED SR LaTcH 


on. The only way to turn off the alarm is by resetting the latch, which is accomplished by 
making the Reset input equal to 1. 
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In Section 5.1 we saw that the basic SR latch can serve as a useful memory element. It 
remembers its state when both the S and R inputs are 0. It changes its state in response 
to changes in the signals on these inputs. The state changes occur at the time when the 
changes in the signals occur. If we cannot control the time of such changes, then we don’t 
know when the latch may change its state. 

In the alarm system of Figure 5.1, it may be desirable to be able to enable or disable 
the entire system by means of a control input, Enable. Thus when enabled, the system 
would function as described above. In the disabled mode, changing the Set input from 0 to 
1 would not cause the alarm to turn on. The latch in Figure 5.4a cannot provide the desired 
operation. But the latch circuit can be modified to respond to the input signals S and R only 
when Enable = 1. Otherwise, it would maintain its state. 

The modified circuit is depicted in Figure 5.5a. It includes two AND gates that provide 
the desired control. When the control signal C/k is equal to 0, the S’ and R’ inputs to the 
latch will be 0, regardless of the values of signals S and R. Hence the latch will maintain 
its existing state as long as C/k = 0. When Clk changes to 1, the S’ and R’ signals will be 
the same as the S and R signals, respectively. Therefore, in this mode the latch will behave 
as we described in Section 5.1. Note that we have used the name C/k for the control signal 
that allows the latch to be set or reset, rather than call it the Enable signal. The reason is that 
such circuits are often used in digital systems where it is desirable to allow the changes in 
the states of memory elements to occur only at well-defined time intervals, as if they were 
controlled by a clock. The control signal that defines these time intervals is usually called 
the clock signal. The name Clk is meant to reflect this nature of the signal. 

Circuits of this type, which use a control signal, are called gated latches. Because our 
circuit exhibits set and reset capability, it is called a gated SR latch. Figure 5.5b describes 
its behavior. It defines the state of the Q output at time t+ 1, namely, Q(f+ 1), as a 
function of the inputs S, R, and Clk. When Clk = 0, the latch will remain in the state it 
is in at time f¢, that is, Q(t), regardless of the values of inputs S and R. This is indicated 
by specifying S = x and R = x, where x means that it does not matter if the signal value 
is 0 or 1. When Clk = 1, the circuit behaves as the basic latch in Figure 5.4. It is set by 
S = | andreset by R = 1. The last row of the table, where § = R = 1, shows that the state 
Q(t + 1) is undefined because we don’t know whether it will be 0 or 1. This corresponds 
to the situation described in Section 5.1 in conjunction with the timing diagram in Figure 
5.4 at time t;9. At this time both $ and R inputs go from | to 0, which causes the oscillatory 
behavior that we discussed. If S = R = 1, this situation will occur as soon as Clk goes from 
1 to 0. To ensure a meaningful operation of the gated SR latch, it is essential to avoid the 
possibility of having both the S and R inputs equal to 1 when C/k changes from | to 0. 

A timing diagram for the gated SR latch is given in Figure 5.5c. It shows Clk as a 
periodic signal that is equal to | at regular time intervals to suggest that this is how the 
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R Ck S R Q(t+1) 
QO . x QA (no change) 
1 0 O QA (no change) 
ae 1 01 0 
1 1 0O 1 
1 1 1 x 
S Ss’ 
(a) Circuit (b) Characteristic table 


Clk 


(c) Timing diagram 


S Q 
Clk 
R Q 


(d) Graphical symbol 


Figure 5.5 — Gated SR latch. 
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S 
Q 

Clk 
Q 

R 


Figure 5.6 — Gated SR latch with NAND gates. 


clock signal usually appears in a real system. The diagram presents the effect of several 
combinations of signal values. Observe that we have labeled one output as Q and the other 
as its complement Q, rather than Q,, and Q, as in Figure 5.4. Since the undefined mode, 
where S = R = 1, must be avoided in practice, the normal operation of the latch will have 
the outputs as complements of each other. Moreover, we will often say that the latch is set 
when Q = 1, and it is reset when Q = 0. A graphical symbol for the gated SR latch is given 
in Figure 5.5d. 


5.2.1 GATED SR LATCH wiTH NAND GATES 


So far we have implemented the basic latch with cross-coupled NOR gates. We can also 
construct the latch with NAND gates. Using this approach, we can implement the gated 
SR latch as depicted in Figure 5.6. The behavior of this circuit is described by the table 
in Figure 5.5b. Note that in this circuit, the clock is gated by NAND gates, rather than by 
AND gates. Note also that the S and R inputs are reversed in comparison with the circuit 
in Figure 5.5a. 
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In Section 5.2 we presented the gated SR latch and showed how it can be used as the memory 
element in the alarm system of Figure 5.1. This latch is useful for many other applications. 
In this section we describe another gated latch that is even more useful in practice. It has a 
single data input, called D, and it stores the value on this input, under the control of a clock 
signal. It is called a gated D latch. 

To motivate the need for a gated D latch, consider the adder/subtractor unit discussed 
in Chapter 3 (Figure 3.12). When we described how that circuit is used to add numbers, we 
did not discuss what is likely to happen with the sum bits that are produced by the adder. 
Adder/subtractor units are often used as part of a computer. The result of an addition or 
subtraction operation is often used as an operand in a subsequent operation. Therefore, it 
is necessary to be able to remember the values of the sum bits generated by the adder until 
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they are needed again. We might think of using the basic latches to remember these bits, 
one bit per latch. In this context, instead of saying that a latch remembers the value of a 
bit, it is more illuminating to say that the latch stores the value of the bit or simply “stores 
the bit.” We should think of the latch as a storage element. 

But can we obtain the desired operation using the basic latches? We can certainly reset 
all latches before the addition operation begins. Then we would expect that by connecting 
a sum bit to the S$ input of a latch, the latch would be set to | if the sum bit has the value 
1; otherwise, the latch would remain in the 0 state. This would work fine if all sum bits 
are 0 at the start of the addition operation and, after some propagation delay through the 
adder, some of these bits become equal to | to give the desired sum. Unfortunately, the 
propagation delays that exist in the adder circuit cause a big problem in this arrangement. 
Suppose that we use a ripple-carry adder. When the X and Y inputs are applied to the adder, 
the sum outputs may alternate between 0 and | a number of times as the carries ripple 
through the circuit. The problem is that if we connect a sum bit to the S' input of a latch, 
then if the sum bit is temporarily a | and then settles to 0 in the final result, the latch will 
remain set to | erroneously. 

The problem caused by the alternating values of the sum bits in the adder could be 
solved by using the gated SR latches, instead of the basic latches. Then we could arrange 
that the clock signal is 0 during the time needed by the adder to produce a correct sum. 
After allowing for the maximum propagation delay in the adder circuit, the clock should 
go to | to store the values of the sum bits in the gated latches. As soon as the values have 
been stored, the clock can return to 0, which ensures that the stored values will be retained 
until the next time the clock goes to 1. To achieve the desired operation, we would also 
have to reset all latches to 0 prior to loading the sum-bit values into these latches. This is 
an awkward way of dealing with the problem, and it is preferable to use the gated D latches 
instead. 

Figure 5.7a shows the circuit for a gated D latch. It is based on the gated SR latch, but 
instead of using the S and R inputs separately, it has just one data input, D. For convenience 
we have labeled the points in the circuit that are equivalent to the S and R inputs. If D = 1, 
then S = 1 and R = 0, which forces the latch into the state Q = 1. If D=0, then S = 0 
and R = 1, which causes Q = 0. Of course, the changes in state occur only when Clk = 1. 

In this circuit it is impossible to have the troublesome situation where S = R= 1. In 
the gated D latch, the output Q merely tracks the value of the input D while Clk = 1. As 
soon as Clk goes to 0, the state of the latch is frozen until the next time the clock signal goes 
to 1. Therefore, the gated D latch stores the value of the D input seen at the time the clock 
changes from | to 0. Figure 5.7 also gives the characteristic table, the graphical symbol, 
and a timing diagram for the gated D latch. 

The timing diagram illustrates what happens if the D signal changes while Clk = 1. 
During the third clock pulse, starting at f;, the output Q changes to 1 because D = 1. But 
midway through the pulse D goes to 0, which causes Q to go to 0. This value of Q is stored 
when Clk changes to 0. Now no further change in the state of the latch occurs until the 
next clock pulse, at t4. The key point to observe is that as long as the clock has the value 1, 
the Q output follows the D input. But when the clock has the value 0, the Q output cannot 
change. The logic values are implemented as low and high voltage levels, as explained in 
detail in Appendix B. Since the output of the gated D latch is controlled by the level of the 
clock input, the latch is said to be level sensitive. The circuits in Figures 5.5 through 5.7 are 
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(d) Timing diagram 


Figure 5.7} ~— Gated D latch. 


level sensitive. We will show in Section 5.4 that it is possible to design storage elements 
for which the output changes only at the point in time when the clock changes from one 
value to the other. Such circuits are said to be edge triggered. 


5.3.1 EFFECTS OF PROPAGATION DELAYS 


In the previous discussion we ignored the effects of propagation delays. In practical circuits 
it is essential to take these delays into account. Consider the gated D latch in Figure 5.7a. 
It stores the value of the D input that is present at the time the clock signal changes from 
1 to 0. It operates properly if the D signal is stable (that is, not changing) at the time Clk 
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Figure 5.8 — Setup and hold times. 


goes from | to 0. But it may lead to unpredictable results if the D signal also changes at 
this time. Therefore, the designer of a logic circuit that generates the D signal must ensure 
that this signal is stable when the critical change in the clock signal takes place. 

Figure 5.8 illustrates the critical timing region. The minimum time that the D signal 
must be stable prior to the negative edge of the Clk signal is called the setup time, ts,, of 
the latch. The minimum time that the D signal must remain stable after the negative edge 
of the Clk signal is called the hold time, t,, of the latch. The values of f,, and t, depend 
on the technology used. Manufacturers of integrated circuit chips provide this information 
on the data sheets that describe their chips. Typical values for a modern technology may 
be t;, = 0.3 ns and t, = 0.2 ns. We will give examples of how setup and hold times affect 
the speed of operation of circuits in Section 5.15. The behavior of storage elements when 
setup or hold times are violated is discussed in Chapter 7. 


5.4 EpGE-TRIGGERED D FLIP-FLOPS 


In the level-sensitive latches, the state of the latch keeps changing according to the values of 
input signals during the period when the clock signal is active (equal to 1 in our examples). 
As we will see in Sections 5.8 and 5.9, there is also a need for storage elements that can 
change their states no more than once during one clock cycle. We will now discuss circuits 
that exhibit such behavior. 


5.4.1 MASTER-SLAVE D FLIP-FLOP 


Consider the circuit given in Figure 5.9a, which consists of two gated D latches. The first, 
called master, changes its state while Clock = 1. The second, called slave, changes its state 
while Clock = 0. The operation of the circuit is such that when the clock is high, the master 
tracks the value of the D input signal and the slave does not change. Thus the value of Q,,, 
follows any changes in D, and the value of Q, remains constant. When the clock signal 


5.4 EDGE-TRIGGERED D FLIP-FLOPS 


Master Slave 
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(a) Circuit 


Clock 


(b) Timing diagram 


q> 


(c) Graphical symbol 


Figure 5.9 —_ Master-slave D flip-flop. 


changes to 0, the master stage stops following the changes in the D input. At the same time, 
the slave stage responds to the value of the signal Q,,, and changes state accordingly. Since 
Q,, does not change while Clock = 0, the slave stage can undergo at most one change of 
state during a clock cycle. From the external observer’s point of view, namely, the circuit 
connected to the output of the slave stage, the master-slave circuit changes its state at the 
negative-going edge of the clock. The negative edge is the edge where the clock signal 
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changes from | to 0. Regardless of the number of changes in the D input to the master 
stage during one clock cycle, the observer of the Q, signal will see only the change that 
corresponds to the D input at the negative edge of the clock. 

The circuit in Figure 5.9 is called a master-slave D flip-flop. The term flip-flop denotes 
a storage element that changes its output state at the edge of a controlling clock signal. The 
timing diagram for this flip-flop is shown in Figure 5.9b. A graphical symbol is given in 
Figure 5.9c. In the symbol we use the > mark to denote that the flip-flop responds to the 
“active edge” of the clock. We place a bubble on the clock input to indicate that the active 
edge for this particular circuit is the negative edge. 

We can augment the circuit in Figure 5.9a by reversing the connections of the clock 
signal to the master and slave stages. That is, include an inverter in the clock input to the 
master stage and connect the uncomplemented clock signal to the slave stage. Now, the state 
of the master stage will be transferred into the slave stage when the clock signal changes 
from 0 to 1. This circuit behaves as a positive-edge-triggered master-slave D flip-flop. It 
can be represented by the graphical symbol in Figure 5.9c where the bubble on the clock 
input is removed. 


Level-Sensitive versus Edge-Triggered Storage Elements 

At this point it is useful to compare the timing in the various storage elements that 
we have discussed. Figure 5.10 shows three different types of storage elements that are 
driven by the same data and clock inputs. The first element is a gated D latch, which is 
level sensitive. The second one is a positive-edge-triggered D flip-flop, and the third one is 
a negative-edge-triggered D flip-flop. To accentuate the differences between these storage 
elements, the D input changes its values more than once during each half of the clock 
cycle. Observe that the gated D latch follows the D input as long as the clock is high. The 
positive-edge-triggered flip-flop responds only to the value of D when the clock changes 
from 0 to 1. The negative-edge-triggered flip-flop responds only to the value of D when 
the clock changes from | to 0. 


5.4.2 OTHER TYPES OF EDGE-TRIGGERED D FLIP-FLOPS 


Master-slave flip-flops illustrate the concept of edge triggering very clearly. Other circuits 
have been used to accomplish the same task. Consider the circuit presented in Figure 5.1 1a. 
It requires only six NAND gates and, hence, fewer transistors than the master-slave circuit. 
The operation of the circuit is as follows. When Clock = 0, the outputs of gates 2 and 3 
are high. Thus P1 = P2 = 1, which maintains the output latch, comprising gates 5 and 6, 
in its present state. At the same time, the signal P3 is equal to D, and P4 is equal to its 
complement D. When Clock changes to 1, the following changes take place. The values 
of P3 and P4 are transmitted through gates 2 and 3 to cause P1 = D and P2 = D, which 
sets Q = D and Q = D. To operate reliably, P3 and P4 must be stable when Clock changes 
from 0 to 1. Hence the setup time of the flip-flop is equal to the delay from the D input 
through gates 4 and | to P3. The hold time is given by the delay through gate 3 because 
once P2 is stable, the changes in D no longer matter. 
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(a) Circuit 
Clock 
D 
Q 
Q, 
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(b) Timing diagram 


Figure 5.10 = Comparison of level-sensitive and edge-triggered D storage elements. 


For proper operation it is necessary to show that after Clock changes to 1 any further 
changes in D will not affect the output latch as long as Clock = 1. We have to consider two 
cases. Suppose first that D = 0 at the positive edge of the clock. Then P2 = 0, which will 
keep the output of gate 4 equal to | as long as Clock = 1, regardless of the value of the D 
input. The second case is if D = | at the positive edge of the clock. Then Pl = 0, which 
forces the outputs of gates 1 and 3 to be equal to 1, regardless of the D input. Therefore, 
the flip-flop ignores changes in the D input while Clock = 1. 
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(a) Circuit 


Clock > 


(b) Graphical symbol 


Figure 5.11 A positive-edge-triggered D flip-flop. 


This circuit behaves as a positive-edge-triggered D flip-flop. A similar circuit, con- 
structed with NOR gates, can be used as a negative-edge-triggered flip-flop. 


5.4.3. D FiLie-FLOPS WITH CLEAR AND PRESET 


Flip-flops are often used for implementation of circuits that can have many possible states, 
where the response of the circuit depends not only on the present values of the circuit’s 
inputs but also on the particular state that the circuit is in at that time. We will discuss 
a general form of such circuits in Chapter 6. A simple example is a counter circuit that 
counts the number of occurrences of some event, perhaps passage of time. We will discuss 
counters in detail in Section 5.9. A counter comprises a number of flip-flops, whose outputs 
are interpreted as anumber. The counter circuit has to be able to increment or decrement the 
number. It is also important to be able to force the counter into a known initial state (count). 
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(a) Circuit 


Preset_n 
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(b) Graphical symbol 


Figure 5.12 —Master-slave D flip-flop with Clear and Preset. 


Obviously, it must be possible to clear the count to zero, which means that all flip-flops 
must have Q = 0. It is equally useful to be able to preset each flip-flop to Q = 1, to insert 
some specific count as the initial value in the counter. These features can be incorporated 
into the circuits of Figures 5.9 and 5.11 as follows. 

Figure 5.12a shows an implementation of the circuit in Figure 5.9a using NAND gates. 
The master stage is just the gated D latch of Figure 5.7a. Instead of using another latch 
of the same type for the slave stage, we can use the slightly simpler gated SR latch of 
Figure 5.6. This eliminates one NOT gate from the circuit. 

A simple way of providing the clear and preset capability is to add an extra input to 
each NAND gate in the cross-coupled latches, as indicated in blue. Placing a 0 on the 
Clear_n input will force the flip-flop into the state Q = 0. If Clear_n = 1, then this input 
will have no effect on the NAND gates. Similarly, Preset_n = 0 forces the flip-flop into 
the state Q = 1, while Preset_n = | has no effect. To denote that the Clear_n and Preset_n 
inputs are active when their value is 0, we appended the letter n (for “negative’’) to these 
names. We should note that the circuit that uses this flip-flop should not try to force both 
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(b) Graphical symbol (c) Adding a synchronous clear 


Figure 5.13 _Positive-edge-triggered D flip-flop with Clear and Preset. 


Clear_n and Preset_n to 0 at the same time. A graphical symbol for this flip-flop is shown 
in Figure 5.12b. 

A similar modification can be done on the edge-triggered flip-flop of Figure 5.11a, as 
indicated in Figure 5.13a. Again, both Clear_n and Preset_n inputs are active low. They 
do not disturb the flip-flop when they are equal to 1. 

In the circuits in Figures 5.12a and 5.13a, the effect of a low signal on either the Clear_n 
or Preset_n input is immediate. For example, if Clear_n = 0 then the flip-flop goes into 
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the state Q = 0 immediately, regardless of the value of the clock signal. In such a circuit, 
where the Clear_n signal is used to clear a flip-flop without regard to the clock signal, we 
say that the flip-flop has an asynchronous clear. In practice, it is often preferable to clear 
the flip-flops on the active edge of the clock. Such synchronous clear can be accomplished 
as shown in Figure 5.13c. The flip-flop operates normally when the Clear_n input is equal 
to 1. But if Clear_n goes to 0, then on the next positive edge of the clock the flip-flop will 
be cleared to 0. We will examine the clearing of flip-flops in more detail in Section 5.10. 


5.4.4 Fiipe-FLop TIMING PARAMETERS 


In Section 5.3.1 we discussed timing issues related to latch circuits. In practice such 
issues are equally important for circuits with flip-flops. Figure 5.14a shows a positive- 
edge triggered flip-flop with asynchronous clear, and part (b) of the figure illustrates some 
important timing parameters for this flip-flop. Data is loaded into the D input of the flip- 
flop on a positive clock edge, and this logic value must be stable during the setup time, f,,,, 
before the clock edge occurs. The data must remain stable during the hold time, t,,, after the 
edge. If the setup or hold requirements are not adhered to in a circuit that uses this flip-flop, 
then it may enter an unstable condition known as metastability; we discuss this concept in 
Chapter 7. 

As indicated in Figure 5.14, a clock-to-Q propagation delay, f,q, is incurred before the 
value of Q changes after a positive clock edge. In general, the delay may not be exactly 
the same for the cases when Q changes from | to 0 or 0 to 1, but we assume for simplicity 
that these delays are equal. For the flip-flops in a commercial chip, two values are usually 
specified for fq, representing the maximum and minimum delays that may occur in practice. 
Specifying a range of values when estimating the delays in a chip is acommon practice due 
to many sources of variation in delay that are caused by the chip manufacturing process. 
In Section 5.15 we provide some examples that illustrate the effects of flip-flop timing 
parameters on the operation of circuits. 
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The D flip-flop is a versatile storage element that can be used for many purposes. By 
including some simple logic circuitry to drive its input, the D flip-flop may appear to be a 
different type of storage element. An interesting modification is presented in Figure 5.15a. 
This circuit uses a positive-edge-triggered D flip-flop. The feedback connections make 
the input signal D equal to either the value of Q or Q under the control of the signal that 
is labeled T. On each positive edge of the clock, the flip-flop may change its state Q(t). 
If T = 0, then D = Q and the state will remain the same, that is, Q(t + 1) = Q(t). But 
if T = 1, then D = Q and the new state will be Q(t + 1) = Q(t). Therefore, the overall 
operation of the circuit is that it retains its present state if 7 = 0, and it reverses its present 
state if T = 1. 
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(b) Timing diagram 


Figure 5.14 _Flip-flop timing parameters. 


The operation of the circuit is specified in the form of a characteristic table in Fig- 
ure 5.15b. Any circuit that implements this table is called a T flip-flop. The name T flip-flop 
derives from the behavior of the circuit, which “toggles” its state when T = 1. The toggle 
feature makes the T flip-flop a useful element for building counter circuits, as we will see 
in Section 5.9. 


| 5.6 JK Fiip-FLop 


Another interesting circuit can be derived from Figure 5.15a. Instead of using a single 
control input, 7, we can use two inputs, J and K, as indicated in Figure 5.16a. For this 
circuit the input D is defined as 


D=JQ+KQ 


5.6 JK FLip-FLop 


(b) Characteristic table (c) Graphical symbol 


Clock 


(d) Timing diagram 


Figure 5.15 _‘T flip-flop. 


A corresponding characteristic table is given in Figure 5.16b. The circuit is called a JK 
flip-flop. It combines the behaviors of SR and T flip-flops in a useful way. It behaves as 
the SR flip-flop, where J = S and K = R, for all input values except J = K = 1. For the 
latter case, which has to be avoided in the SR flip-flop, the JK flip-flop toggles its state like 
the T flip-flop. 

The JK flip-flop is a versatile circuit. It can be used for straight storage purposes, just 
like the D and SR flip-flops. But it can also serve as a T flip-flop by connecting the J and 
K inputs together. 
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Figure 5.16 JK flip-flop. 


| 5.7 SUMMARY OF TERMINOLOGY 


We have used the terminology that is quite common. But the reader should be aware that 
different interpretations of the terms /atch and flip-flop can be found in the literature. Our 
terminology can be summarized as follows: 


Basic latch is a feedback connection of two NOR gates or two NAND gates, which 
can store one bit of information. It can be set to | using the S input and reset to 0 
using the R input. 

Gated latch is a basic latch that includes input gating and a control input signal. The 
latch retains its existing state when the control input is equal to 0. Its state may be 
changed when the control signal is equal to 1. In our discussion we referred to the 
control input as the clock. We considered two types of gated latches: 


e¢ Gated SR latch uses the S and R inputs to set the latch to | or reset it to 0, 
respectively. 


¢ Gated D latch uses the D input to force the latch into a state that has the same 
logic value as the D input. 


5.8 REGISTERS 


A flip-flop is a storage element that can have its output state changed only on the edge 
of the controlling clock signal. If the state changes when the clock signal goes from 0 
to 1, we say that the flip-flop is positive-edge triggered. If the state changes when 

the clock signal goes from | to 0, we say that the flip-flop is negative-edge triggered. 
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A flip-flop stores one bit of information. When a set of n flip-flops is used to store n bits of 
information, such as an -bit number, we refer to these flip-flops as a register. A common 
clock is used for each flip-flop in a register, and each flip-flop operates as described in the 
previous sections. The term register is merely a convenience for referring to n-bit structures 
consisting of flip-flops. 


5.8.1 Suirr REGISTER 


In Section 3.6 we explained that a given number is multiplied by 2 if its bits are shifted 
one bit position to the left and a 0 is inserted as the new least-significant bit. Similarly, the 
number is divided by 2 if the bits are shifted one bit-position to the right. A register that 
provides the ability to shift its contents is called a shift register. 

Figure 5.17a shows a four-bit shift register that is used to shift its contents one bit- 
position to the right. The data bits are loaded into the shift register in a serial fashion using 
the Jn input. The contents of each flip-flop are transferred to the next flip-flop at each 
positive edge of the clock. An illustration of the transfer is given in Figure 5.17b, which 
shows what happens when the signal values at /n during eight consecutive clock cycles are 
1,0, 1, 1, 1, 0, 0, and 0, assuming that the initial state of all flip-flops is 0. 

To implement a shift register it is necessary to use flip-flops. The level-sensitive gated 
latches are not suitable, because a change in the value of Jn would propagate through more 
than one latch during the time when the clock is equal to 1. 


5.8.2 PARALLEL-ACCESS SHIFT REGISTER 


In computer systems it is often necessary to transfer n-bit data items. This may be done by 
transmitting all bits at once using n separate wires, in which case we say that the transfer 
is performed in parallel. But it is also possible to transfer all bits using a single wire, by 
performing the transfer one bit at a time, in m consecutive clock cycles. We refer to this 
scheme as serial transfer. To transfer an n-bit data item serially, we can use a shift register 
that can be loaded with all n bits in parallel (in one clock cycle). Then during the next n 
clock cycles, the contents of the register can be shifted out for serial transfer. The reverse 
operation is also needed. If bits are received serially, then after n clock cycles the contents 
of the register can be accessed in parallel as an n-bit item. 
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Figure 5.17. — A simple shift register. 


Figure 5.18 shows a four-bit shift register that provides the parallel access. A 2-to-1 
multiplexer on its D input allows each flip-flop to be connected to two different sources. 
One source is the preceding flip-flop, which is needed for the shift-register operation. The 
other source is the external input that corresponds to the bit that is to be loaded into the 
flip-flop as a part of the parallel-load operation. The control signal Shift/Load is used to 
select the mode of operation. If Shift/Load = 0, then the circuit operates as a shift register. 
If Shift/Load = 1, then the parallel input data are loaded into the register. In both cases the 
action takes place on the positive edge of the clock. 

In Figure 5.18 we have chosen to label the flip-flops’ outputs as Q;,..., Qy because 
shift registers are often used to hold binary numbers. The contents of the register can be 
accessed in parallel by observing the outputs of all flip-flops. The flip-flops can also be 
accessed serially, by observing the values of Qg during consecutive clock cycles while the 
contents are being shifted. A circuit in which data can be loaded in series and then accessed 
in parallel is called a series-to-parallel converter. Similarly, the opposite type of circuit is a 
parallel-to-series converter. The circuit in Figure 5.18 can perform both of these functions. 


5.9 COUNTERS 


Parallel output 


Shift/Load 


Serial 
input 


Clock 


Parallel input 


Figure 5.18 —_Parallel-access shift register. 
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In Chapter 3 we dealt with circuits that perform arithmetic operations. We showed how 
adder/subtractor circuits can be designed, either using a simple cascaded (ripple-carry) 
structure that is inexpensive but slow or using a more complex carry-lookahead structure 
that is both more expensive and faster. In this section we examine special types of addition 
and subtraction operations, which are used for the purpose of counting. In particular, we 
want to design circuits that can increment or decrement a count by 1. Counter circuits are 
used in digital systems for many purposes. They may count the number of occurrences of 
certain events, generate timing intervals for control of various tasks in a system, keep track 
of time elapsed between specific events, and so on. 

Counters can be implemented using the adder/subtractor circuits discussed in Chapter 3 
and the registers discussed in Section 5.8. However, since we only need to change the 
contents of a counter by 1, it is not necessary to use such elaborate circuits. Instead, we 
can use much simpler circuits that have a significantly lower cost. We will show how the 
counter circuits can be designed using T and D flip-flops. 


5.9.1 ASYNCHRONOUS COUNTERS 


The simplest counter circuits can be built using T flip-flops because the toggle feature is 
naturally suited for the implementation of the counting operation. 
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Up-Counter with T Flip-Flops 

Figure 5.19a gives a three-bit counter capable of counting from 0 to 7. The clock inputs 
of the three flip-flops are connected in cascade. The T input of each flip-flop is connected 
to a constant 1, which means that the state of the flip-flop will be reversed (toggled) at each 
positive edge of its clock. We are assuming that the purpose of this circuit is to count the 
number of pulses that occur on the primary input called Clock. Thus the clock input of 
the first flip-flop is connected to the Clock line. The other two flip-flops have their clock 
inputs driven by the Q output of the preceding flip-flop. Therefore, they toggle their state 
whenever the preceding flip-flop changes its state from Q = | to Q = 0, which results in a 
positive edge of the Q signal. 

Figure 5.19b shows a timing diagram for the counter. The value of Qy toggles once each 
clock cycle. The change takes place shortly after the positive edge of the Clock signal. The 
delay is caused by the propagation delay through the flip-flop. Since the second flip-flop 
is clocked by Qo, the value of Q, changes shortly after the negative edge of the Qo signal. 
Similarly, the value of Q, changes shortly after the negative edge of the Q, signal. If we 
look at the values Q,Q,Qg as the count, then the timing diagram indicates that the counting 


(a) Circuit 
Clock 
Qo 
Q) 
Q 
Count 0 1 2 3 4 5 6 7 0 


(b) Timing diagram 


Figure 5.19 A three-bit up-counter. 


5.9 COUNTERS 


sequence is 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, and so on. This circuit is a modulo-8 counter. Because 
it counts in the upward direction, we call it an up-counter. 

The counter in Figure 5.19a has three stages, each comprising a single flip-flop. Only 
the first stage responds directly to the Clock signal; we say that this stage is synchronized 
to the clock. The other two stages respond after an additional delay. For example, when 
Count = 3, the next clock pulse will cause the Count to go to 4. As indicated by the arrows 
in the timing diagram in Figure 5.19b, this change requires the toggling of the states of 
all three flip-flops. The change in Qy is observed only after a propagation delay from the 
positive edge of Clock. The Q, and Q, flip-flops have not yet changed; hence for a brief 
time the count is Q,Q;Qy = 010. The change in Q, appears after a second propagation 
delay, at which point the count is 000. Finally, the change in Q, occurs after a third delay, 
at which point the stable state of the circuit is reached and the count is 100. This behavior is 
similar to the rippling of carries in the ripple-carry adder circuit of Figure 3.5. The circuit 
in Figure 5.19a is an asynchronous counter, or a ripple counter. 


Down-Counter with T Flip-Flops 


A slight modification of the circuit in Figure 5.19a is presented in Figure 5.20a. The 
only difference is that in Figure 5.20a the clock inputs of the second and third flip-flops are 


(a) Circuit 
Clock 
Qo 
Q 
Q 
Count 0 7 6 5 4 3 2 1 0 


(b) Timing diagram 


Figure 5.20 =A three-bit down-counter. 
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driven by the Q outputs of the preceding stages, rather than by the Q outputs. The timing 
diagram, given in Figure 5.20b, shows that this circuit counts in the sequence 0, 7, 6, 5, 4, 
3, 2, 1, 0, 7, and so on. Because it counts in the downward direction, we say that it is a 
down-counter. 

It is possible to combine the functionality of the circuits in Figures 5.19a and 5.20a 
to form a counter that can count either up or down. Such a counter is called an up/down- 
counter. We leave the derivation of this counter as an exercise for the reader (Problem 5.15). 


5.9.2 SYNCHRONOUS COUNTERS 


The asynchronous counters in Figures 5.19a and 5.20a are simple, but not very fast. If a 
counter with a larger number of bits is constructed in this manner, then the delays caused 
by the cascaded clocking scheme may become too long to meet the desired performance 
requirements. We can build a faster counter by clocking all flip-flops at the same time, 
using the approach described below. 


Synchronous Counter with T Flip-Flops 

Table 5.1 shows the contents of a three-bit up-counter for eight consecutive clock 
cycles, assuming that the count is initially 0. Observing the pattern of bits in each row of 
the table, it is apparent that bit Q) changes on each clock cycle. Bit Q,; changes only when 
Qy = 1. Bit Q, changes only when both Q, and Qg are equal to 1. In general, for an n-bit 
up-counter, a given flip-flop changes its state only when all the preceding flip-flops are in 
the state Q = |. Therefore, if we use T flip-flops to realize the counter, then the T inputs 
are defined as 


To =1 
T; =Q) 
T = QQ; 


Table 5.1 _ Derivation of the synchronous 
up-counter. 
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T3 = QQ:Q) 


T, = QoQ: 7 Q,-1 


An example of a four-bit counter based on these expressions is given in Figure 5.2la. 
Instead of using AND gates of increased size for each stage, we use a factored arrangement 
as shown in the figure. This arrangement does not slow down the response of the counter, 
because all flip-flops change their states after a propagation delay from the positive edge 
of the clock. Note that a change in the value of Qy may have to propagate through several 
AND gates to reach the flip-flops in the higher stages of the counter, which requires a certain 


Clock 


- aH 
ror 


Count 0 1 2 3 4 5 6 7 8 9 10 11 12 13 144 15 0 1 


(b) Timing diagram 


Figure 5.21 A four-bit synchronous up-counter. 
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Clear_n 


Figure 5.22 _ Inclusion of Enable and Clear capability. 


amount of time. This time must not exceed the clock period. Actually, it must be less than 
the clock period minus the setup time for the flip-flops. 

Figure 5.21b gives a timing diagram. It shows that the circuit behaves as a modulo-16 
up-counter. Because all changes take place with the same delay after the active edge of the 
Clock signal, the circuit is called a synchronous counter. 


Enable and Clear Capability 

The counters in Figures 5.19 through 5.21 change their contents in response to each 
clock pulse. Often it is desirable to be able to inhibit counting, so that the count remains 
in its present state. This may be accomplished by including an Enable control signal, as 
indicated in Figure 5.22. The circuit is the counter of Figure 5.21, where the Enable signal 
controls directly the T input of the first flip-flop. Connecting the Enable also to the AND- 
gate chain means that if Enable = 0, then all T inputs will be equal to 0. If Enable = 1, 
then the counter operates as explained previously. 

In many applications it is necessary to start with the count equal to zero. This is easily 
achieved if the flip-flops can be cleared, as explained in Section 5.4.3. The clear inputs on 
all flip-flops can be tied together and driven by a Clear_n control input. 


Synchronous Counter with D Flip-Flops 

While the toggle feature makes T flip-flops a natural choice for the implementation 
of counters, it is also possible to build counters using other types of flip-flops. The JK 
flip-flops can be used in exactly the same way as the T flip-flops because if the J and K 
inputs are tied together, a JK flip-flop becomes a T flip-flop. We will now consider using D 
flip-flops for this purpose. 

It is not obvious how D flip-flops can be used to implement a counter. We will present a 
formal method for deriving such circuits in Chapter 6. Here we will present a circuit structure 
that meets the requirements but will leave the derivation for Chapter 6. Figure 5.23 gives 
a four-bit up-counter that counts in the sequence 0, 1, 2,..., 14, 15,0, 1, and so on. The 
count is indicated by the flip-flop outputs Q,;Q,Q Qo. If we assume that Enable = 1, then 
the D inputs of the flip-flops are defined by the expressions 


Dy = Q €@1=Q 
D; = Q; BQ) 
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Enable 


Clock 


Figure 5.23 A four-bit counter with D flip-flops. 


Dz = Qo BQ, Qo 
D3 = Q3 BQ; Qo 


For a larger counter the ith stage is defined by 


D; = Q; 8 Q)_-1Qi_-a + + QUQo 


We will show how to derive these equations in Chapter 6. 
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We have included the Enable control signal in Figure 5.23 so that the counter counts the 
clock pulses only if Enable = 1. In effect, the above equations are modified to implement 
the circuit in the figure as follows 


Do = Qo ® Enable 

D, = Q; ® Q - Enable 

Dz = Qo @ Q, « Qy - Enable 

D3 = Q; ® Qs - Q; - Qo : Enable 


The operation of the counter is based on our observation for Table 5.1 that the state of the 
flip-flop in stage i changes only if all preceding flip-flops are in the state Q = 1. This makes 
the output of the AND gate that feeds stage i equal to 1, which causes the output of the XOR 
gate connected to D; to be equal to Q;. Otherwise, the output of the XOR gate provides 
D; = Q,, and the flip-flop remains in the same state. 

We have included an extra AND gate that produces the output Z. This signal makes it 
easy to concatenate two such counters to create a larger counter. It is also useful in appli- 
cations where it is necessary to detect the state where the count has reached its maximum 
value (all 1s) and will go to 0 in the next clock cycle. 

The counter in Figure 5.23 is essentially the same as the circuit in Figure 5.22. We 
showed in Figure 5.15a that a T flip-flop can be formed from a D flip-flop by providing the 
extra gating that gives 


D=QT+QT 
=QoeT 


Thus in each stage in Figure 5.23, the D flip-flop and the associated XOR gate implement 
the functionality of a T flip-flop. 


5.9.3 COUNTERS WITH PARALLEL LOAD 


Often it is necessary to start counting with the initial count being equal to 0. This state can 
be achieved by using the capability to clear the flip-flops as indicated in Figure 5.22. But 
sometimes it is desirable to start with a different count. To allow this mode of operation, 
a counter circuit must have some inputs through which the initial count can be loaded. 
Using the Clear and Preset inputs for this purpose is a possibility, but a better approach is 
discussed below. 

The circuit of Figure 5.23 can be modified to provide the parallel-load capability as 
shown in Figure 5.24. A two-input multiplexer is inserted before each D input. One input to 
the multiplexer is used to provide the normal counting operation. The other input is a data 
bit that can be loaded directly into the flip-flop. A control input, Load, is used to choose the 
mode of operation. The circuit counts when Load = 0. A new initial value, D3D2D,Do, is 
loaded into the counter when Load = 1. 
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Enable 
Do 


Load 
Clock 


Figure 5.24 — A counter with parallel-load capability. 
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5.10 RESET SYNCHRONIZATION 


We have already mentioned that it is important to be able to clear, or reset, the contents 
of a counter prior to commencing a counting operation. This can be done using the clear 
capability of the individual flip-flops. But we may also be interested in resetting the count to 
0 during the normal counting process. An n-bit up-counter functions naturally as a modulo- 
2” counter. Suppose that we wish to have a counter that counts modulo some base that is 
not a power of 2. For example, we may want to design a modulo-6 counter, for which the 
counting sequence is 0, 1, 2, 3, 4,5, 0, 1, and so on. 

The most straightforward approach is to recognize when the count reaches 5 and then 
reset the counter. An AND gate can be used to detect the occurrence of the count of 5. 
Actually, it is sufficient to ascertain that Q; = Qy = 1, which is true only for 5 in our 
desired counting sequence. A circuit based on this approach is given in Figure 5.25a. It 


Enable 


(a) Circuit 
Clock 
Qj 
Q 
Q 
Count 0 1 2 3 4 5 0 1 


(b) Timing diagram 


Figure 5.25 = A modulo-6 counter with synchronous reset. 
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uses a three-bit synchronous counter of the type depicted in Figure 5.24. The parallel-load 
feature of the counter is used to reset its contents when the count reaches 5. The resetting 
action takes place at the positive clock edge after the count has reached 5. It involves 
loading D2D,Do = 000 into the flip-flops. As seen in the timing diagram in Figure 5.25), 
the desired counting sequence is achieved, with each value of the count being established 
for one full clock cycle. Because the counter is reset on the active edge of the clock, we 
say that this type of counter has a synchronous reset. 

Consider now the possibility of using the clear feature of individual flip-flops, rather 
than the parallel-load approach. The circuit in Figure 5.26a illustrates one possibility. It 
uses the counter structure of Figure 5.21a. Since the clear inputs are active when low, a 
NAND gate is used to detect the occurrence of the count of 5 and cause the clearing of all 
three flip-flops. Conceptually, this seems to work fine, but closer examination reveals a 
potential problem. The timing diagram for this circuit is given in Figure 5.26b. It shows a 
difficulty that arises when the count is equal to 5. As soon as the count reaches this value, 
the NAND gate triggers the resetting action. The flip-flops are cleared to 0 a short time after 
the NAND gate has detected the count of 5. This time depends on the gate delays in the 
circuit, but not on the clock. Therefore, signal values Q,Q;Qy) = 101 are maintained for a 


(a) Circuit 


Clock 
Qo 
Q, 


Q 
Count 0 1 2 3 4 5 0 1 2 
(b) Timing diagram 


Figure 5.26 = A modulo-6 counter with asynchronous reset. 
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time that is much less than a clock cycle. Depending on a particular application of such a 
counter, this may be adequate, but it may also be completely unacceptable. For example, if 
the counter is used in a digital system where all operations in the system are synchronized 
by the same clock, then this narrow pulse denoting Count = 5 would not be seen by the rest 
of the system. This is not a good way of designing circuits, because pulses of short length 
often cause unforeseen difficulties in practice. The approach employed in Figure 5.26a is 
said to use asynchronous reset. 

The timing diagrams in Figures 5.25b and 5.26b suggest that synchronous reset is a 
better choice than asynchronous reset. The same observation is true if the natural counting 
sequence has to be broken by loading some value other than zero. The new value of the 
count can be established cleanly using the parallel-load feature. The alternative of using 
the clear and preset capability of individual flip-flops to set their states to reflect the desired 
count has the same problems as discussed in conjunction with the asynchronous reset. 


5.11 OTHER TYPES OF COUNTERS 


In this section we discuss three other types of counters that can be found in practical 
applications. The first uses the decimal counting sequence, and the other two generate 
sequences of codes that do not represent binary numbers. 


5.11.1 BCD COUNTER 


Binary-coded-decimal (BCD) counters can be designed using the approach explained in 
Section 5.10. A two-digit BCD counter is presented in Figure 5.27. It consists of two 
modulo-10 counters, one for each BCD digit, which we implemented using the parallel- 
load four-bit counter of Figure 5.24. Note that in a modulo-10 counter it is necessary to 
reset the four flip-flops after the count of 9 has been obtained. Thus the Load input to each 
stage is equal to | when Q; = Qy = |, which causes Os to be loaded into the flip-flops at 
the next positive edge of the clock signal. Whenever the count in stage 0, BCDo, reaches 9 
it is necessary to enable the second stage so that it will be incremented when the next clock 
pulse arrives. This is accomplished by keeping the Enable signal for BCD, low at all times 
except when BCDp = 9. 

In practice, it has to be possible to clear the contents of the counter by activating some 
control signal. Two OR gates are included in the circuit for this purpose. The control input 
Clear can be used to load Os into the counter. Observe that in this case Clear is active when 
high. 


5.11.2 RING COUNTER 


In the preceding counters the count is indicated by the state of the flip-flops in the counter. 
In all cases the count is a binary number. Using such counters, if an action is to be taken 
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Enable 


BCD, 


Clock 


Clear 


BCD, 


Figure 5.27 ~— A two-digit BCD counter. 


as a result of a particular count, then it is necessary to detect the occurrence of this count. 
This may be done using AND gates, as illustrated in Figures 5.25 through 5.27. 

It is possible to devise a counterlike circuit in which each flip-flop reaches the state 
Q; = 1 for exactly one count, while for all other counts Q; = 0. Then Q; indicates directly 
an occurrence of the corresponding count. Actually, since this does not represent binary 
numbers, it is better to say that the outputs of the flips-flops represent a code. Such a circuit 
can be constructed from a simple shift register, as indicated in Figure 5.28a. The Q output 
of the last stage in the shift register is fed back as the input to the first stage, which creates 
a ringlike structure. If a single | is injected into the ring, this 1 will be shifted through 
the ring at successive clock cycles. For example, in a four-bit structure, the possible codes 
QoQ, QQ; will be 1000, 0100, 0010, and 0001. As we said in Section 4.2, such encoding, 
where there is a single | and the rest of the code variables are 0, is called a one-hot code. 

The circuit in Figure 5.28a is referred to as a ring counter. Its operation has to be 
initialized by injecting a | into the first stage. This is achieved by using the Start control 
signal, which presets the left-most flip-flop to | and clears the others to 0. We assume that 
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Start 


Clock 


(a) An n-bit ring counter 


Qy Q; Q Q3 
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2-to-4 decoder 
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Clock 
Two-bit up-counter 


Start 


(b) A four-bit ring counter 


Figure 5.28 Ring counter. 


all changes in the value of the Start signal occur shortly after an active clock edge so that 
the flip-flop timing parameters are not violated. 

The circuit in Figure 5.28a can be used to build a ring counter with any number of 
bits, n. For the specific case of n = 4, part (b) of the figure shows how a ring counter 
can be constructed using a two-bit up-counter and a decoder. When Srart is set to 1, the 
counter is reset to 00. After Start changes back to 0, the counter increments its value in the 
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normal way. The 2-to-4 decoder, described in Section 4.2, changes the counter output into 
a one-hot code. For the count values 00, 01, 10, 11, 00, and so on, the decoder produces 
Q9Q,Q,Q; = 1000, 0100, 0010, 0001, 1000, and so on. This circuit structure can be used 
for larger ring counters, as long as the number of bits is a power of two. 


5.11.3 JOHNSON COUNTER 


An interesting variation of the ring counter is obtained if, instead of the Q output, we take 
the Q output of the last stage and feed it back to the first stage, as shown in Figure 5.29. This 
circuit is known as a Johnson counter. An n-bit counter of this type generates a counting 
sequence of length 2n. For example, a four-bit counter produces the sequence 0000, 1000, 
1100, 1110, 1111, 0111, 0011, 0001, 0000, and so on. Note that in this sequence, only a 
single bit has a different value for two consecutive codes. 

To initialize the operation of the Johnson counter, it is necessary to reset all flip-flops, 
as shown in the figure. Observe that neither the Johnson nor the ring counter will generate 
the desired counting sequence if not initialized properly. 


5.11.4 REMARKS ON COUNTER DESIGN 


The sequential circuits presented in this chapter, namely, registers and counters, have a 
regular structure that allows the circuits to be designed using an intuitive approach. In 
Chapter 6 we will present a more formal approach to design of sequential circuits and show 
how the circuits presented in this chapter can be derived using this approach. 


Qo Q, Q,-1 


Reset_n 
Clock 


Figure 5.29 = Johnson counter. 
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5.12 UsINnG STORAGE ELEMENTS WITH CAD TooLs 


This section shows how circuits with storage elements can be designed using either schematic 
capture or Verilog code. 


5.12.1 INCLUDING STORAGE ELEMENTS IN SCHEMATICS 


One way to create a circuit is to draw a schematic that builds latches and flip-flops from 
logic gates. Because these storage elements are used in many applications, most CAD 
systems provide them as prebuilt modules. Figure 5.30 shows a schematic created with 
a schematic capture tool, which includes three types of flip-flops that are imported from 
a library provided as part of the CAD system. The top element is a gated D latch, the 
middle element is a positive-edge-triggered D flip-flop, and the bottom one is a positive- 
edge-triggered T flip-flop. The D and T flip-flops have asynchronous, active-low clear and 
preset inputs. If these inputs are not connected in a schematic, then the CAD tool makes 
them inactive by assigning the default value of | to them. 

When the gated D latch is synthesized for implementation in a chip, the CAD tool may 
not generate the cross-coupled NOR or NAND gates shown in Section 5.2. In some chips 
the AND-OR circuit depicted in Figure 5.31 may be preferable. This circuit is functionally 
equivalent to the cross-coupled version in Section 5.2. One aspect of this circuit should be 
mentioned. From the functional point of view, it appears that the circuit can be simplified 
by removing the AND gate with the inputs Data and Latch. Without this gate, the top 


[> Fliptlop 


Figure 5.30 Three types of storage elements in a schematic. 
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Data 
Clock 


Latch 


Figure 5.31 — Gated D latch generated by CAD tools. 


AND gate sets the value stored in the latch when the clock is 1, and the bottom AND gate 
maintains the stored value when the clock is 0. But without this gate, the circuit has a 
timing problem known as a static hazard. A detailed explanation of hazards will be given 
in Chapter 11. 

The circuit in Figure 5.30 can be implemented in a CPLD as shown in Figure 5.32. 
The D and T flip-flops are realized using the flip-flops on the chip that are configurable as 
either D or T types. The figure depicts in blue the gates and wires needed to implement the 
circuit in Figure 5.30. 

The results of a timing simulation for the implementation in Figure 5.32 are given in 
Figure 5.33. The Latch signal, which is the output of the gated D latch implemented as 
indicated in Figure 5.31, follows the Data input whenever the Clock signal is 1. Because 
of propagation delays in the chip, the Latch signal is delayed in time with respect to the 
Data signal. Since the Flipflop signal is the output of the D flip-flop, it changes only after 
a positive clock edge. Similarly, the output of the T flip-flop, called Toggle in the figure, 
toggles when Data = | and a positive clock edge occurs. The timing diagram illustrates 
the delay from when the positive clock edge occurs at the input pin of the chip until a 
change in the flip-flop output appears at the output pin of the chip. This time is called the 
clock-to-output time, teo. 


5.12.2 USING VERILOG CONSTRUCTS FOR STORAGE ELEMENTS 


In Section 4.6 we described a number of Verilog constructs. We now show how these 
constructs can be used to describe storage elements. 

Asimple way of specifying a storage element is by using the if-else statement to describe 
the desired behavior responding to changes in the levels of data and clock inputs. Consider 
the always block 


always @(Control, B) 
if (Control) 
A=B; 
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Figure 5.32 —_ Implementation of the schematic in Figure 5.30 in a CPLD. 


where A is a variable of reg type. This code specifies that the value of A should be made 
equal to the value of B when Control = 1. But the statement does not indicate an action that 
should occur when Control = 0. In the absence of an assigned value, the Verilog compiler 
assumes that the value of A caused by the if statement must be maintained when Control 
is not equal to 1. This notion of implied memory is realized by instantiating a latch in the 
circuit. 
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Figure 5.33 Timing simulation for the storage elements in Figure 5.30. 


module D_latch (D, Clk, Q); 
input D, Clk; 
output reg Q; 


always @(D, Clk) 
if (Clk) 
Q=D; 


endmodule 


Figure 5.34 = Code for a gated D latch. 
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CODE FOR A GATED DLATCH The code in Figure 5.34 defines a module named D_latch, 
which has the inputs D and C/k and the output Q. The if clause defines that the Q output must 
take the value of D when Clk = 1. Since no else clause is given, a latch will be synthesized 
to maintain the value of Q when Clik = 0. Therefore, the code describes a gated D latch. 
The sensitivity list includes Clk and D because both of these signals can cause a change in 
the value of the Q output. 


Example 5.1 


An always construct is used to define a circuit that responds to changes in the signals 
that appear in the sensitivity list. While in the examples presented so far the always blocks 
are sensitive to the levels of signals, it is also possible to specify that a response should take 
place only at a particular edge of a signal. The desired edge is specified by using the Verilog 
keywords posedge and negedge, which are used to implement edge-triggered circuits. 


CODE FORADFLIP-FLOP Figure 5.35 defines amodule named flipflop, which is a positive- 
edge-triggered D flip-flop. The sensitivity list contains only the clock signal because it is 
the only signal that can cause a change in the Q output. The keyword posedge specifies 


Example 5.2 
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module flipflop (D, Clock, Q); 
input D, Clock; 
output reg Q; 


always @(posedge Clock) 
Q=D; 


endmodule 


Figure 5.35 = Code for aD flip-flop. 


that a change may occur only on the positive edge of Clock. At this time the output Q 
is set to the value of the input D. Since posedge appears in the sensitivity list, Q will be 
implemented as the output of a flip-flop. 


5.12.3 BLOCKING AND NON-BLOCKING ASSIGNMENTS 


In all our Verilog examples presented so far we have used the equal sign for assignments, 
as in 


f=xl & x2; 
or 
C=A+B; 
or 
Q=D; 


This notation is called a blocking assignment. A Verilog compiler evaluates the statements 
in an always block in the order in which they are written. If a variable is given a value by 
a blocking assignment statement, then this new value is used in evaluating all subsequent 
statements in the block. 


Example 5.3 


Consider the code in Figure 5.36. Since the always block is sensitive to the positive clock 
edge, both Q1 and Q2 will be implemented as the outputs of D flip-flops. However, because 
blocking assignments are involved, these two flip-flops will not be connected in cascade, 
as the reader might expect. The first statement 
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module example5_3 (D, Clock, Q1, Q2); 
input D, Clock; 
output reg QI, Q2; 


always @(posedge Clock) 


begin 
Ql =D; 
Q2=Q1; 
end 
endmodule 


Figure 5.36 _ Incorrect code for two cascaded flip-flops. 


i : 
Clock > 


Q 
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Figure 5.37 — Circuit for Example 5.3. 


sets QI to the value of D. This new value is used in evaluating the subsequent statement 
Q2=Ql; 


which results in Q2 = QI = D. The synthesized circuit has two parallel flip-flops, as illus- 
trated in Figure 5.37. A synthesis tool will likely delete one of these redundant flip-flops as 
an optimization step. 
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Verilog also provides a non-blocking assignment, denoted with <=. All non-blocking 
assignment statements in an always block are evaluated using the values that the variables 
have when the always block is entered. Thus, a given variable has the same value for all 
statements in the block. The meaning of non-blocking is that the result of each assignment 
is not seen until the end of the always block. 
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Example 5.4 


Figure 5.38 gives the same code as in Figure 5.36, but using non-blocking assignments. In 
the two statements 


QI <=D; 
Q2 <= QI; 


the variables Q1 and Q2 have some value at the start of evaluating the always block, and 
then they change to a new value concurrently at the end of the always block. This code 
generates a cascaded connection between flip-flops, which implements the shift register 
depicted in Figure 5.39. 

The differences between blocking and non-blocking assignments are illustrated further 
by the following two examples. 


Example 5.5 


Code that involves some gates in addition to flip-flops is defined in Figure 5.40 using block- 
ing assignment statements. The resulting circuit is given in Figure 5.41. Both fand g are 
implemented as the outputs of D flip-flops, because the sensitivity list of the always block 
specifies the event posedge Clock. Since blocking assignments are used, the updated value 
of f generated by the statement f = x1 & x2 has to be seen immediately by the following 
statement g = f | x3. Thus, the AND gate that produces x1 & x2 is connected to the OR 
gate that feeds the g flip-flop, as shown in Figure 5.41. 


Example 5.6 


If non-blocking assignments are used, as given in Figure 5.42, then both/ and g are updated 
simultaneously. Hence, the previous value of f is used in updating the value of g, which 
means that the output of the flip-flop that generates f is connected to the OR gate that feeds 
the g flip-flop. This gives rise to the circuit in Figure 5.43. 


module example5_4 (D, Clock, Q1, Q2); 
input D, Clock; 
output reg Q1, Q2; 


always @(posedge Clock) 


begin 
Q1<=D; 
Q2<=QI; 
end 
endmodule 


Figure 5.38 = Code for two cascaded flip-flops. 
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Clock 


Figure 5.39 — Circuit defined in Figure 5.38. 


module example5_5 (x1, x2, x3, Clock, f, g); 
input x1, x2, x3, Clock; 
output reg f, g; 


always @(posedge Clock) 
begin 

f=xl & x2; 

g=f| x3; 
end 


endmodule 


Figure 5.40 = Code for Example 5.5. 
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Figure 5.41 = Circuit for Example 5.5. 
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module example5_6 (x1, x2, x3, Clock, f, g); 
input x1, x2, x3, Clock; 
output reg f, g; 


always @(posedge Clock) 
begin 

f<=xl & x2; 

goat | x3: 
end 


endmodule 


Figure 5.42 = Code for Example 5.6. 
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Figure 5.43 — Circuit for Example 5.6. 


It is interesting to consider what circuit would be synthesized if the statements that 
specify f and g were reversed. For the code in Figure 5.40 the impact would be significant. 
If g is evaluated first, then the second statement does not depend on the first one, because f 
does not depend on g. The resulting circuit would be the same as the one in Figure 5.43. In 
contrast, reversing the statement order would make no difference for the code in Figure 5.42, 
in which the non-blocking assignment is used. 

The use of blocking assignments for sequential circuits can easily lead to wrong results, 
as demonstrated in Figure 5.37. The dependence on ordering of blocking assignments is 
dangerous, as shown in the previous example. For this reason, only non-blocking assign- 
ments should be used to describe sequential circuits. 


5.12 USING STORAGE ELEMENTS WITH CAD TooLs 


5.12.4 Non-BLOCKING ASSIGNMENTS FOR COMBINATIONAL 
CIRCUITS 


A natural question at this point is whether non-blocking assignments can be used for combi- 
national circuits. The answer is that they can be used in most situations, but when subsequent 
assignments in an always block depend on the results of previous assignments, the non- 
blocking assignments can generate nonsensical circuits. As an example, assume that we 
have a three-bit vector A = d2a;a9, and we wish to generate a combinational function f 
that is equal to | when there are two adjacent bits in A that have the value |. One way to 
specify this function with blocking assignments is 


always @(A) 
begin 
f=A[1] & A[0]; 
f=f| (A[2] & ALI); 
end 


These statements produce the desired logic function, which is f = a;ag + a2a,. Consider 
now changing the code to use the non-blocking assignments 


f <=A[1] & A[0]; 
f <=f| (A[2] & A[1]); 


There are two key aspects of the Verilog semantics relevant to this code: 


1. The results of non-blocking assignments are visible only after all of the statements in 
the always block have been evaluated. 


2. When there are multiple assignments to the same variable inside an always block, the 
result of the last assignment is maintained. 


In this example, f has an unspecified initial value when we enter the always block. The 
first statement assigns f = ajdpo, but this result is not visible to the second statement. It still 
sees the original unspecified value of f. The second assignment overrides (deletes!) the 
first assignment and produces the logic function f = f + aa. This expression does not 
correspond to a combinational circuit, because it represents an AND-OR circuit in which 
the OR-gate is fed back to itself. It is best to use blocking assignments when describing 
combinational circuits, so as to avoid accidentally creating a sequential circuit. 


5.12.5 Fiie-FLOPS WITH CLEAR CAPABILITY 


By using a particular sensitivity list and a specific style of if-else statement, it is possible 
to include clear (or preset) signals on flip-flops. 


293 


294 


CHAPTER 5 e¢ _ FLip-FLOPS, REGISTERS, AND COUNTERS 


Example 5.7 


ASYNCHRONOUS CLEAR Figure 5.44 gives a module that defines a D flip-flop with an 
asynchronous active-low reset (clear) input. When Resetn, the reset input, is equal to 0, the 
flip-flop’s Q output is set to 0. Note that the sensitivity list specifies the negative edge of 
Resetn as an event trigger along with the positive edge of the clock. We cannot omit the 
keyword negedge because the sensitivity list cannot have both edge-triggered and level- 
sensitive signals. 


Example 5.8 


SYNCHRONOUS CLEAR Figure 5.45 shows how a D flip-flop with a synchronous reset 
input can be described. In this case the reset signal is acted upon only when a positive 
clock edge arrives. This code generates the circuit in Figure 5.13c, which has an AND gate 
connected to the flip-flop’s D input. 


module flipflop (D, Clock, Resetn, Q); 
input D, Clock, Resetn; 
output reg Q; 


always @(negedge Resetn, posedge Clock) 
if (!Resetn) 


Q<=0; 
else 
Q < = D; 
endmodule 


Figure 5.44 _D flip-flop with asynchronous reset. 


module flipflop (D, Clock, Resetn, Q); 
input D, Clock, Resetn; 
output reg Q; 


always @(posedge Clock) 
if (!Resetn) 


Q<=0; 
else 
Q<=D,; 
endmodule 


Figure 5.45 _D flip-flop with synchronous reset. 
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5.13 USING VERILOG CONSTRUCTS FOR REGISTERS AND 
COUNTERS 


In this section we show how registers and counters can be specified in Verilog code. Fig- 
ure 5.44 gives code for a D flip-flop. One way to describe an n-bit register is to write 
hierarchical code that includes n instances of the D flip-flop subcircuit. A simpler approach 
is to use the same code as in Figure 5.44 and define the D input and Q output as multibit 
signals. 


AN N-BIT REGISTER Since registers of different sizes are often needed in logic circuits, it 
is advantageous to define a register module for which the number of flip-flops can be easily 
changed. The code for an n-bit register is given in Figure 5.46. The parameter n specifies 
the number of flip-flops in the register. By changing this parameter, the code can represent 
a register of any size. 


A FOUR-BIT SHIFT REGISTER Assume that we wish to write Verilog code that represents the 
four-bit parallel-access shift register in Figure 5.18. One approach is to write hierarchical 
code that uses four subcircuits. Each subcircuit consists of a D flip-flop with a 2-to-1 
multiplexer connected to the D input. Figure 5.47 defines the module named muxdff, which 
represents this subcircuit. The two data inputs are named Do and Dj, and they are selected 
using the Se/ input. The if-else statement specifies that on the positive clock edge if Sel 
= 0, then Q is assigned the value of Do; otherwise, Q is assigned the value of D,. An 
alternative way of defining the same circuit is given in Figure 5.48. In this code, the 
conditional assignment statement specifies a 2-to-1 multiplexer with the output D, which 
is then connected to the flip-flop in the always block. 

Figure 5.49 defines the four-bit shift register. The module Stage3 instantiates the left- 
most flip-flop, which has the output Q;, and the module Stage0 instantiates the right-most 
flip-flop, Qy. When L = 1, the register is loaded in parallel from the R input; and when L 
= 0, shifting takes place in the left to right direction. Serial data is shifted into the most- 
significant bit, Q3, from the w input. 


Example 5.9 


Example 5.10 


ALTERNATIVE CODE FOR A FOUR-BIT SHIFT REGISTER A different style of code for the 
four-bit shift register is given in Figure 5.50. Instead of using subcircuits, the shift register is 
defined using the approach presented in Example 5.4. All actions take place at the positive 
edge of the clock. If L = 1, the register is loaded in parallel with the four bits of input R. 
If L = 0, the contents of the register are shifted to the right and the value of the input w is 
loaded into the most-significant bit Q3. 


Example 5.11 
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module regn (D, Clock, Resetn, Q); 
parameter n = 16; 
input [n—1:0] D; 
input Clock, Resetn; 
output reg [n—1:0] Q; 


always @(negedge Resetn, posedge Clock) 
if (!Resetn) 
Q<=0; 
else 
Q<=D; 


endmodule 


Figure 5.46 = Code for an n-bit register with asynchronous clear. 


module muxdff (DO, D1, Sel, Clock, Q); 
input DO, D1, Sel, Clock; 
output reg Q; 


always @(posedge Clock) 
if (!Sel) 
Q<=D0; 
else 
Q<=DI1; 


endmodule 


Figure 5.47 = Code for a D flip-flop with a 2-to-1 multiplexer on 
the D input. 


module muxdff (DO, D1, Sel, Clock, Q); 
input DO, D1, Sel, Clock; 
output reg Q; 


wire D; 
assign D = Sel ? D1 : DO; 


always @(posedge Clock) 
Q<=D; 


endmodule 


Figure 5.48 Alternative code for a D flip-flop with a 2-to-1 
multiplexer on the D input. 
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module shift4 (R, L, w, Clock, Q); 
input [3:0] R; 
input L, w, Clock; 
output wire [3:0] Q; 


muxdff Stage3 (w, R[3], L, Clock, Q[3]); 

muxdff Stage2 (Q[3], R[2], L, Clock, Q[2]); 
muxdff Stage! (Q[2], R[1], L, Clock, Q[1]); 
muxdff StageO (Q[1], R[0], L, Clock, Q[0]); 


endmodule 


Figure 5.49 —_ Hierarchical code for a four-bit shift register. 


module shift4 (R, L, w, Clock, Q); 
input [3:0] R; 
input L, w, Clock; 
output reg [3:0] Q; 


always @(posedge Clock) 


if (L) 
Q<=R; 
else 
begin 
Ql0] <= QUI); 
Ql] <= Q(2]; 
Ql2] <= Q{3); 
QI3] <= w; 
end 


endmodule 


Figure 5.50 = Alternative code for a four-bit shift register. 


AN N-BIT SHIFT REGISTER Figure 5.51 shows the code that can be used to represent shift Example 5.12 
registers of any size. The parameter n, which has the default value 16 in the figure, sets 

the number of flip-flops. The code is identical to that in Figure 5.50 with two exceptions. 

First, R and Q are defined in terms of n. Second, the else clause that describes the shifting 

operation is generalized to work for any number of flip-flops by using a for loop. 
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module shiftn (R, L, w, Clock, Q); 
parameter n = 16; 
input [n—1:0] R; 
input L, w, Clock; 
output reg [n—1:0] Q; 
integer k; 


always @(posedge Clock) 
if (L) 
Q<=R; 
else 
begin 
for (k =0;k < n—-1;k=k+1) 
Q[k] < = Q[k+1]; 
Q[n-1] <=w; 
end 


endmodule 


Figure 5.51 = An r-bit shift register. 


module upcount (Resetn, Clock, E, Q); 
input Resetn, Clock, E; 
output reg [3:0] Q; 


always @(negedge Resetn, posedge Clock) 
if (!Resetn) 


Q<=0; 
else if (E) 
Q<=Q+t1; 
endmodule 


Figure 5.52 = Code for a four-bit up-counter. 


Example 5.13 UP-COUNTER Figure 5.52 represents a four-bit up-counter with a reset input, Resetn, and 
an enable input, E. The outputs of the flip-flops in the counter are represented by the vector 
named Q. The if statement specifies an asynchronous reset of the counter if Resetn = 0. 
The else if clause specifies that if E = 1 the count is incremented on the positive clock edge. 
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UP-COUNTER WITH PARALLEL LOAD The code in Figure 5.53 defines an up-counter that 
has a parallel-load input in addition to a reset input. The parallel data is provided as the 
input vector R. The first if statement provides the same asynchronous reset as in Figure 
5.52. The else if clause specifies that if L = 1 the flip-flops in the counter are loaded in 
parallel from the R inputs on the positive clock edge. If L = 0, the count is incremented, 
under control of the enable input E. 


Example 5.14 


DOWN-COUNTER WITH PARALLELLOAD _ Figure 5.54 shows the code for a down-counter 
named downcount. A down-counter is normally used by loading it with some starting count 
and then decrementing its contents. The starting count is represented in the code by the 
vector R. On the positive clock edge, if L = | the counter is loaded with the input R, and 
if L = 0 the count is decremented, under control of the enable input E. 


Example 5.15 


UP/DOWN COUNTER _ Verilog code for an up/down counter is given in Figure 5.55. This 
module combines the capabilities of the counters defined in Figures 5.53 and 5.54. It 
includes a control signal up_down that governs the direction of counting. 


Example 5.16 


module upcount (R, Resetn, Clock, E, L, Q); 
input [3:0] R; 
input Resetn, Clock, E, L; 
output reg [3:0] Q; 


always @(negedge Resetn, posedge Clock) 


if (!Resetn) 
Q<=0; 
else if (L) 
Q<=R; 
else if (E) 
Q<=Q+1; 
endmodule 


Figure 5.53 A four-bit up-counter with a parallel load. 
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module downcount (R, Clock, E, L, Q); 
parameter n = 8; 
input [n—1:0] R; 
input Clock, L, E; 
output reg [n-—1:0] Q; 


always @(posedge Clock) 


if (L) 
Q<=R; 
else if (E) 
Q<=Q-1; 
endmodule 


Figure 5.54 =A down-counter with a parallel load. 


module updowncount (R, Clock, L, E, up_down, Q); 
parameter n = 8; 
input [n—1:0] R; 
input Clock, L, E, up_down; 
output reg [n—1:0] Q; 


always @(posedge Clock) 


if (L) 
Q<=R; 
else if (E) 


Q<=Q+(up_down ? |: -1); 
endmodule 


Figure 5.55 = Code for an up/down counter. 


5.13.1 FLir-FLOPS AND REGISTERS WITH ENABLE INPUTS 


We showed in Figures 5.22 and 5.23 how an enable input can be used in counter circuits 
to be able to prevent the flip-flops from toggling when an active clock edge occurs. It is 
also useful in many other types of circuits to be able to prevent the data stored in flip-flops 
from changing when an active clock edge occurs. For D flip-flops, this capability can be 
provided by adding a multiplexer to the flip-flop, as shown in Figure 5.56a. When E = 0, 
the flip-flop output cannot change, because the multiplexer connects Q to the input of the 
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(a) Using a multiplexer (b) Clock gating 


Figure 5.56 Providing an enable input for a D flip-flop. 


module rege (D, Clock, Resetn, E, Q); 
input D, Clock, Resetn, E; 
output reg Q; 


always @(posedge Clock, negedge Resetn) 
if (Resetn == 0) 


Q<=0; 
else if (E) 
Q<=D; 
endmodule 


Figure 5.57 = Code for aD flip-flop with enable. 


flip-flop. But if E = 1, then the multiplexer allows new data to be loaded into the flip-flop 
from the D input. Instead of using the multiplexer shown in the figure, another way to 
implement the enable feature is to use a two-input AND gate as illustrated in Figure 5.56. 
Then setting E = 0 prevents the clock signal from reaching the flip-flop’s clock input. This 
method seems simpler than the multiplexer approach, but we will show in Section 5.15 that 
it can cause problems in practical operation. We will prefer the multiplexer-based approach 
over gating the clock with an AND gate. 

Verilog code for a D flip-flop with an asynchronous reset input and an enable input is 
given in Figure 5.57. After first specifying the reset condition, the always block uses an 
else if clause to ensure that the data stored in the flip-flop can change only when E = 1. 
We can extend the enable capability to registers with n bits by using n 2-to-1 multiplexers 
controlled by F, as shown in Figure 5.58. The multiplexer for each flip-flop, i, selects either 
the external data bit, R;, or the flip-flop’s output, Q;. 
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module regne (R, Clock, Resetn, E, Q); 
parameter n= 8; 
input [n—1:0] R; 
input Clock, Resetn, E; 
output reg [n—1:0] Q; 


always @(posedge Clock, negedge Resetn) 
if (Resetn == 0) 
Q<=0; 
else if (E) 
Q<=R; 


endmodule 


Figure 5.58 = Anu-bit register with an enable input. 


5.13.2 SHirtT REGISTERS WITH ENABLE INPUTS 


It is useful to be able to inhibit the shifting operation in a shift register by using an enable 
input, E. We showed in Figure 5.18 that shift registers can be constructed with a parallel- 
load capability, which is implemented using a multiplexer. Figure 5.59 shows how the 
enable feature can be added by using an additional multiplexer. If the parallel-load control 
input, L, is 1, the flip-flops are loaded in parallel. If L = 0, the additional multiplexer selects 
new data to be loaded into the flip-flops only if the enable E is 1. 

Verilog code that represents the circuit in Figure 5.59 is given in Figure 5.60. When 
L = 1, the register is loaded in parallel from the R input. When L = 0 and E = 1, the data 
in the shift register is shifted in a left-to-right direction. 


5.14 DESIGN EXAMPLE 


This section presents an example of a digital system that makes use of some of the building 
blocks described in this chapter and in Chapter 4. 


5.14.1 REACTION TIMER 


Electronic devices operate at remarkably fast speeds, with the typical delay through a logic 
gate being less than | ns. In this example we use a logic circuit to measure the speed of a 
much slower type of device—a person. 

We will design a circuit that can be used to measure the reaction time of a person to 
a specific event. The circuit turns on a small light, called a light-emitting diode (LED). In 
response to the LED being turned on, the person attempts to press a switch as quickly as 
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Figure 5.59 
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A shift register with parallel-load and enable control inputs. 
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module shiftrne (R, L, E, w, Clock, Q); 
parameter n= 4; 
input [n—-1:0] R; 
input L, E, w, Clock; 
output reg [n—1:0] Q; 
integer k; 


always @(posedge Clock) 
begin 
if (L) 
Q<=R; 
else if (E) 
begin 
Q{n- 1] <=w; 
for (k =n—2;k >= 0;k=k-1) 
Qk] <= Q[k+ 1]; 
end 
end 


endmodule 


Figure 5.60 =A left-to-right shift register with an enable input. 


possible. The circuit measures the elapsed time from when the LED is turned on until the 
switch is pressed. 

To measure the reaction time, a clock signal with an appropriate frequency is needed. 
In this example we use a 100 Hz clock, which measures time at a resolution of 1/100 of a 
second. The reaction time can then be displayed using two digits that represent fractions 
of a second from 00/100 to 99/100. 

Digital systems often include high-frequency clock signals to control various subsys- 
tems. In this case assume the existence of an input clock signal with the frequency 102.4 
kHz. From this signal we can derive the required 100 Hz signal by using a counter as a clock 
divider. A timing diagram for a four-bit counter is given in Figure 5.21. It shows that the 
least-significant bit output, Qo, of the counter is a periodic signal with half the frequency of 
the clock input. Hence we can view Q, as dividing the clock frequency by two. Similarly, 
the Q, output divides the clock frequency by four. In general, output Q; in an n-bit counter 
divides the clock frequency by 2't!. In the case of our 102.4 kHz clock signal, we can use 
a 10-bit counter, as shown in Figure 5.61b. The counter output co has the required 100 Hz 
frequency because 102400 Hz/1024 = 100 Hz. 

The reaction timer circuit has to be able to turn an LED on and off. The graphical 
symbol for an LED is shown in blue in Figure 5.61b. Small blue arrows in the symbol 
represent the light that is emitted when the LED is turned on. The LED has two terminals: 
the one on the left in the figure is the cathode, and the terminal on the right is the anode. 
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Figure 5.61 A reaction-timer circuit. 
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To turn the LED on, the cathode has to be set to a sufficiently lower voltage than the anode, 
which causes a current to flow through the LED. If the voltages on its two terminals are 
equal, the LED is off. 

Figure 5.61b shows one way to control the LED, using an inverter. If the input voltage 
Viep = 0, then the voltage at the cathode is equal to Vpp; hence the LED is off. But if 
Vizp = Vpp, the cathode voltage is 0 V and the LED is on. The amount of current that 
flows is limited by the value of the resistor R;. This current flows through the LED and 
the inverter. Since the current flows into the inverter, we say that the inverter sinks the 
current. The maximum current that a logic gate can sink without sustaining permanent 
damage is usually called Zg,, which stands for the “maximum current when the output is 
low.” The value of Ry is chosen such that the current is less than J9,. As an example 
assume that the inverter is implemented inside a chip for which the value of [oz specified 
in the data sheet for the chip is 12 mA. For Vpp = 5S V, this leads to Ry; © 450 Q because 
5 V/450 Q = 11 mA (there is actually a small voltage drop across the LED when it is turned 
on, but we ignore this for simplicity). The amount of light emitted by the LED is proportional 
to the current flow. If 11 mA is insufficient, then the inverter should be implemented in 
a driver chip, like those described in Appendix B, because drivers provide a higher value 
of I, OL: 

The complete reaction-timer circuit is illustrated in Figure 5.61c, with the inverter 
from part (b) shaded in grey. The graphical symbol for a push-button switch is shown in 
the top left of the diagram. The switch normally makes contact with the top terminals, as 
depicted in the figure. When depressed, the switch makes contact with the bottom terminals; 
when released, it automatically springs back to the top position. In the figure the switch is 
connected such that it normally produces a logic value of 1, and it produces a 0 pulse when 
pressed. 

When depressed, the push-button switch causes the D flip-flop to be synchronously 
reset. The output of this flip-flop determines whether the LED is on or off, and it also 
provides the count enable input to a two-digit BCD counter. As discussed in Section 5.11, 
each digit in a BCD counter has four bits that take the values 0000 to 1001. Thus the 
counting sequence can be viewed as decimal numbers from 00 to 99. A circuit for the 
BCD counter is given in Figure 5.27. Each digit in the counter is connected through a code 
converter to a 7-segment display, which we described in the discussion for Figure 2.63. 
In Figure 5.61c the counter is clocked by the cy output of the clock divider in part (a) of 
the figure. The intended use of the reaction-timer circuit is to first assert the Reset input to 
clear the flip-flop and thus turn off the LED and disable the counter. Asserting the Reset 
input also clears the contents of the counter to 00. The input w normally has the value 0, 
which keeps the flip-flop cleared and prevents the count value from changing. The reaction 
test is initiated by setting w = | for one cy clock cycle. After the next positive edge of the 
clock, the flip-flop output becomes a 1, which turns on the LED. We assume that w returns 
to 0 after one co clock cycle, but the flip-flop output remains at 1 because of the 2-to-1 
multiplexer connected to the D input. The counter is then incremented every 1/100 of a 
second. When the user depresses the switch, the flip-flop is cleared, which turns off the 
LED and stops the counter. The two-digit display shows the elapsed time to the nearest 
1/100 of a second from when the LED was turned on until the user was able to respond by 
depressing the switch. 


5.14 DESIGN EXAMPLE 


Verilog Code 

To describe the circuit in Figure 5.61c using Verilog code, we can make use of subcir- 
cuits for the BCD counter and the 7-segment code converter. Code for the BCD counter, 
which represents the circuit in Figure 5.27, is shown in Figure 5.62. The two-digit BCD 
output is represented by the 2 four-bit signals BCD 1 and BCDO. The Clear input provides 
a synchronous reset for both digits in the counter. If E = 1, the count value is incremented 
on the positive clock edge; and if E = 0, the count value is unchanged. Each digit can 
take the values from 0000 to 1001. Figure 5.63 gives the code for the BCD-to-7-segment 
decoder. 

Figure 5.64 shows the code for the reaction timer. The input signal Pushn represents the 
value produced by the push-button switch. The output signal LEDn represents the output 
of the inverter that is used to control the LED. The two 7-segment displays are controlled 
by the seven-bit signals Digit! and Digit 0. 

The flip-flop in Figure 5.61c is loaded with the value | if w = 1, but if w = 0 the 
stored value in the flip-flop is not changed. This circuit is described by the always block in 
Figure 5.64, which also includes a synchronous reset input; the reset is activated if either 


module BCDcount (Clock, Clear, E, BCD1, BCDO); 
input Clock, Clear, E; 
output reg [3:0] BCD1, BCDO; 


always @(posedge Clock) 
begin 
if (Clear) 
begin 
BCD1 <=0; 
BCDO <=0; 
end 
else if (E) 
if (BCDO == 4’b1001) 
begin 
BCDO <=0; 
if (BCD1 == 4’b1001) 
BCD1 <=0; 
else 
BCD1 <=BCDI1 +1; 
end 
else 
BCDO <= BCD0 + 1; 
end 


endmodule 


Figure 5.62 — Code for the two-digit BCD counter in Figure 5.27. 
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module seg7 (bcd, leds); 
input [3:0] bcd; 
output reg [1:7] leds; 


always @(bcd) 

case (bcd) //abcdefg 

0: leds = 7’b1111110; 

: leds = 7’b0110000; 
: leds = 7’b1101101; 
: leds = 7’b1111001; 
: leds = 7’b0110011; 
: leds = 7’b1011011; 
: leds = 7’b1011111; 
: leds = 7’b1110000; 
: leds =7’b1111111; 
: leds = 7’b1111011; 

default: leds = 7’bx; 
endcase 


OAANNDN PWN 


endmodule 


Figure 5.63 = Code for the BCD-to-7-segment decoder. 


module reaction (Clock, Reset, c9, w, Pushn, LEDn, Digit1, Digit0); 
input Clock, Reset, c9, w, Pushn; 
output wire LEDn; 
output wire [1:7] Digitl, Digit0; 
reg LED; 
wire [3:0] BCD1, BCDO; 


always @(posedge Clock) 
begin 
if (‘Pushn || Reset) 
LED <=0; 
else if (w) 
LED <= 1; 
end 


assign LEDn = ~LED; 

BCDcount counter (c9, Reset, LED, BCD1, BCDO); 
seg7 seg! (BCD1, Digit1); 

seg7 seg0 (BCDO, Digit0); 


endmodule 


Figure 5.64 — Code for the reaction timer. 


5.14 DESIGN EXAMPLE 
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Figure 5.65 — Simulation of the reaction timer circuit. 


Pushn = 0 or Reset = 1. We have chosen to use a synchronous reset because the flip-flop 
output is connected to the enable input E on the BCD counter. As we know from the 
discussion in Section 5.3, it is important that all signals connected to flip-flops meet the 
required setup and hold times. The push-button switch can be pressed at any time and is not 
synchronized to the co clock signal. By using a synchronous reset for the flip-flop in Figure 
5.61c, we avoid possible timing problems in the counter. Of course, the setup time of the 
flip-flop itself may become violated due to the asynchronous operation of the push-button 
switch. We show in Chapter 7 how this type of problem can be alleviated by adding extra 
flip-flops that are used to synchronize signals. 

A simulation of the reaction-timer circuit implemented in a chip is shown in Figure 
5.65. Initially, Reset is asserted to clear the the flip-flop and counter. When w changes to 1, 
the circuit sets LEDn to 0, which represents the LED being turned on. After some amount 
of time, the switch will be depressed. In the simulation we arbitrarily set Pushn to 0 after 
18 cg clock cycles. Thus this choice represents the case when the person’s reaction time is 
about 0.18 seconds. In human terms this duration is a very short time; for electronic circuits 
it is a very long time. An inexpensive personal computer can perform tens of millions of 
operations in 0.18 seconds! 


5.14.2 REGISTER TRANSFER LEVEL (RTL) CoDE 


At this point, we have introduced most of the Verilog constructs that are needed for synthesis. 
Most of our examples give behavioral code, utilizing if-else statements, case statements, for 
loops, and other procedural statements. It is possible to write behavioral code in a style that 
resembles a computer program, in which there is a complex flow of control with many loops 
and branches. With such code, sometimes called high-level behavioral code, it is difficult to 
relate the code to the final hardware implementation; it may even be difficult to predict what 
circuit a high-level synthesis tool will produce. In this book we do not use the high-level 
style of code. Instead, we present Verilog code in such a way that the code can be easily 
related to the circuit that is being described. Most design modules presented are fairly small, 
to facilitate simple descriptions. Larger designs are built by interconnecting the smaller 
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modules. This approach is usually referred to as the register-transfer level (RTL) style of 
code. It is the most popular design method used in practice. RTL code is characterized by a 
straightforward flow of control through the code; it comprises well-understood subcircuits 
that are connected together in a simple way. 


5.15 TIMING ANALYSIS OF FLIP-FLOP CIRCUITS 


In Figure 5.14 we showed the timing parameters associated with a D flip-flop. A simple 
circuit that uses this flip-flop is given in Figure 5.66. We wish to calculate the maximum 
clock frequency, Fina, for which this circuit will operate properly, and also determine if the 
circuit suffers from any hold time violations. In the literature, this type of analysis of circuits 
is usually called timing analysis. We will assume that the flip-flop timing parameters have 
the values f,,, = 0.6 ns, t, = 0.4 ns, and 0.8 ns < tg < 1.0 ns. A range of minimum and 
maximum values is given for f.q because, as we mentioned in Section 5.4.4, this is the usual 
way of dealing with variations in delay that exist in integrated circuit chips. 

To calculate the minimum period of the clock signal, Tiin = 1/Finax, we need to con- 
sider all paths in the circuit that start and end at flip-flops. In this simple circuit there is 
only one such path, which starts when data is loaded into the flip-flop by a positive clock 
edge, propagates to the Q output after the #.g delay, propagates through the NOT gate, and 
finally must meet the setup requirement at the D input. Therefore 


Lin = tQ + tvor + tu 


Since we are interested in the longest delay for this calculation, the maximum value of 
t-qg should be used. For the calculation of tyor we will assume that the delay through any 
logic gate can be calculated as | + 0.14, where k is the number of inputs to the gate. For a 
NOT gate this gives 1.1 ns, which leads to 


Tin = 1.0+ 1.1+ 0.6 = 2.7 ns 
Finax = 1/2.7 ns = 370.37 MHz 


It is also necessary to check if there are any hold time violations in the circuit. In this 
case we need to examine the shortest possible delay from a positive clock edge to a change 
in the value of the D input. The delay is given by f.q + tyor = 0.8 + 1.1 = 1.9 ns. Since 
1.9 ns > t, = 0.4 ns there is no hold time violation. 

As another example of timing analysis of flip-flop circuits, consider the counter circuit 
shown in Figure 5.67. We wish to calculate the maximum clock frequency for which this 
circuit will operate properly assuming the same flip-flop timing parameters as we did for 
Figure 5.66. We will again assume that the propagation delay through a logic gate can be 
calculated as | + 0.1k. 

There are many paths in this circuit that start and end at flip-flops. The longest such 
path starts at flip-flop Qo and ends at flip-flop Q3. The longest path in a circuit is often called 
a critical path. The delay of the critical path includes the clock-to-Q delay of flip-flop Qo, 
the propagation delay through three AND gates, and one XOR-gate delay. We must also 
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Clock 


Clear 


Figure 5.66 — A simple flip-flop circuit. 


Enable 


Clock 


Figure 5.67 ~—A 4-bit counter. 
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account for the setup time of flip-flop Q3. This gives 


Tinin = tea + 3(tanp) + tyor + tsu 
Using the maximum value of f.g gives 


Tnin = 1.0 + 3(1.2) + 1.2 + 0.6 ns = 6.4 ns 
Fnax = 1/6.4 ns = 156.25 MHz 


The shortest paths through the circuit are from each flip-flop to itself, through an XOR 
gate. The minimum delay along each such path is fog + txor = 0.8 + 1.2 = 2.0 ns. Since 
2.0 ns > t, = 0.4 ns there are no hold time violations. 


5.15.1 TiminG ANALYSIS WITH CLOCK SKEW 


In the above analysis we assumed that the clock signal arrived at exactly the same time 
at all four flip-flops. We will now repeat this analysis assuming that the clock signal still 
arrives at flip-flops Qo, Q;, and Q> simultaneously, but that there is a delay in the arrival 
of the clock signal at flip-flop Q3. Such a variation in the arrival time of a clock signal at 
different flip-flops is called clock skew, tskey, and can be caused by a number of factors. 

In Figure 5.67 the critical path through the circuit is from flip-flop Qo to Q3. However, 
the clock skew at Q3 has the effect of reducing this delay, because it provides additional 
time before data is loaded into this flip-flop. Taking a clock skew of 1.5 ns into account, the 
delay of the path from flip-flop Qo to Q3 is given by fog + 3(tanp) + txor + teu — tskew = 
6.4 — 1.5ns = 4.9ns. There is now a different critical path through the circuit, which starts 
at flip-flop Qo and ends at Q>. The delay of this path gives 


Tinin = tog + 2(tand) + txor + tou 
= 1.0 +2(1.2) + 1.24 0.6 ns 
= 5.2 ns 

Frnax = 1/5.2 ns = 192.31 MHz 


In this case the clock skew results in an increase in the circuit’s maximum clock frequency. 
But if the clock skew had been negative, which would be the case if the clock signal arrived 
earlier at flip-flop Q3 than at other flip-flops, then the result would have been a reduced 
ax: 

Since the loading of data into flip-flop Q3 is delayed by the clock skew, it has the 
effect of increasing the hold time requirement of this flip-flop to t, + tstew, for all paths 
that end at Q3 but start at Qo, Q;, or Qo. The shortest such path in the circuit is from 
flip-flop Q» to Q3 and has the delay tog + tanp + tyor = 0.8 + 1.24 1.2 = 3.2 ns. Since 
3.2 ns > ty + tskey = 1.9 ns there is no hold time violation. 

If we repeat the above hold time analysis for clock skew values ftsrey > 3.2 — th = 
2.8 ns, then hold time violations will exist. Thus, if t,,.,, > 2.8 ns the circuit will not work 
reliably at any clock frequency. 
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Consider the circuit in Figure 5.68. In this circuit, there is a path that starts at flip-flop Q), 
passes through some network of logic gates, and ends at the D input of flip-flop Qo. As 
indicated in the figure, different delays may be incurred before the clock signal reaches the 
flip-flops. Let A; and A> be the clock-signal delays for flip-flops Q; and Qo, respectively. 
The clock skew between these two flip-flops is then defined as 


lskew = Ad = A\ 


Let the longest delay along the paths through the logic gates in the circuit be t,. Then, 
the minimum allowable clock period for these two flip-flops is 


Tinin _ tQ + ty + tsu — tskew 


Thus, if A> > Aj, then t,.. allows for an increased value of Fig, but if Ar < A), then 
the clock skew requires a decrease in Fina. 

To calculate whether a hold time violation exists at flip-flop Qo we need to determine 
the delay along the shortest path between the flip-flops. If the minimum delay through the 
logic gates in the circuit is t;, then a hold time violation occurs if 


tea +t < th + tskew 


Here, the hold-time constraint is more difficult to meet if Ay — A, > 0, and itis less difficult 
to meet if A> — A, < 0. 


Example 5.17 


The techniques described above for F',,¢, and hold time analysis can be applied to any 
circuit in which the same, or related, clock signals are connected to all flip-flops. Consider 
again the reaction-timer circuit in Figure 5.61. The clock divider in part (a) of the figure 
generates the co signal, which drives the clock inputs of the flip-flops in the BCD counter. 


Clock 


Figure 5.68 — A general example of clock skew. 
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Figure 5.69 — A modified version of the reaction-timer circuit. 


Since these paths start at the Q output of the co flip-flop but end at the clock inputs of other 
flip-flops, rather than at D inputs, the above timing analysis method cannot be applied. 
However, we could restructure the reaction-timer circuit as indicated in Figure 5.69. Instead 
of using C9 directly as a clock for the BCD counter, a ten-input AND gate is used to generate 
a signal that has the value | for only one of the 1024 count values from the clock divider. 
This signal is then ANDed with the control flip-flop output to cause the BCD counter to be 
incremented at the desired rate of 100 times per second. 

In the circuit of Figure 5.69 all flip-flops are clocked directly by the Clock signal. 
Therefore, the Fj,q, and hold time analysis can now be applied. In general, it is a good 
design approach for sequential circuits to connect the clock inputs of all flip-flops to a 
common clock. We discuss such issues in detail in Chapter 7. 


| 5.16 CONCLUDING REMARKS 


In this chapter we have presented circuits that serve as basic storage elements in digital 
systems. These elements are used to build larger units such as registers, shift registers, 
and counters. Many other texts that deal with this material are available [3-10]. We 
have illustrated how circuits with flip-flops can be described using Verilog code. More 
information on Verilog can be found in [11-18]. In the next chapter a more formal method 
for designing circuits with flip-flops will be presented. 
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(b) Timing diagram 


Figure 5.70 = Circuit for Example 5.18. 


5.17 EXAMPLES OF SOLVED PROBLEMS 


This section presents some typical problems that the reader may encounter, and shows how 
such problems can be solved. 


Problem: Consider the circuit in Figure 5.70a. Assume that the input C is driven by a Example 5.18 
square wave signal with a 50% duty cycle. Draw a timing diagram that shows the waveforms 
at points A and B. Assume that the propagation delay through each gate is A seconds. 


Solution: The timing diagram is shown in Figure 5.70. 


Problem: Determine the functional behavior of the circuit in Figure 5.71. Assume that Example 5.19 
input w is driven by a square wave signal. 


Solution: When both flip-flops are cleared, their outputs are Qo = Q; = 0. After the Clear 
input goes high, each pulse on the w input will cause a change in the flip-flops as indicated 
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Clear 


Figure 5.71 — Circuit for Example 5.19. 


Time FFO FFI 


interval | fy Kg Qo| Jt Ki Qi 


Clear 1 1 0 0 1 0 
ty if 1 1 1 1 0 
to 0 1 0 0 1 1 
13 1 1 0 0 1 0 
ta 1 1 1 1 1 0 


Figure 5.72 = Summary of the behavior of the circuit in Figure 5.71. 


in Figure 5.72. Note that the figure shows the state of the signals after the changes caused 
by the rising edge of a pulse have taken place. 

In consecutive time intervals the values of Q; Qo are 00, 01, 10, 00, 01, and so on. 
Therefore, the circuit generates the counting sequence: 0, 1, 2, 0, 1, and so on. Hence, the 
circuit is a modulo-3 counter. 


Example 5.20 Problem: Design a circuit that can be used to control a vending machine. The circuit has 
five inputs: Q (quarter), D (dime), N (nickel), Coin, and Resetn. When a coin is deposited 
in the machine, a coin-sensing mechanism generates a pulse on the appropriate input (Q, 
D, or N). To signify the occurrence of the event, the mechanism also generates a pulse on 
the line Coin. The circuit is reset by using the Resetn signal (active low). When at least 
30 cents has been deposited, the circuit activates its output, Z. No change is given if the 
amount exceeds 30 cents. 
Design the required circuit by using the following components: a six-bit adder, a six-bit 
register, and any number of AND, OR, and NOT gates. 
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Register S 


Z 


Figure 5.73 — Circuit for Example 5.20. 


Solution: Figure 5.73 gives a possible circuit. The value of each coin is represented by a 
corresponding five-bit number. It is added to the current total, which is held in register S. 
The required output is 


Z = 85 + 8483528] 


The register is clocked by the negative edge of the Coin signal. This allows for a propagation 
delay through the adder, and ensures that a correct sum will be placed into the register. 

In Chapter 9 we will show how this type of control circuit can be designed using a 
more structured approach. 
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module vend (N, D, Q, Resetn, Coin, Z); 
input N, D, Q, Resetn, Coin; 
output Z; 
wire [4:0] X; 
reg [5:0] S; 
assign X[0] =N| Q; 
assign X[1] =D; 
assign X[2] =N; 
assign X[3] =D| Q; 
assign X[4] =Q; 
assign Z = S[5] | (S[4] & S[3] & S[2] & S[1]); 


always @(negedge Coin, negedge Resetn) 
if (Resetn == 1’b0) 
S <=5’b00000; 
else 
S <= {1’b0, X} +S; 


endmodule 


Figure 5.74 — Code for Example 5.21. 


Example 5.21 


Problem: Write Verilog code to implement the circuit in Figure 5.73. 


Solution: Figure 5.74 gives the desired code. 


Example 5.22 


Problem: In Section 5.15 we presented a timing analysis for the counter circuit in Figure 
5.67. Redesign this circuit to reduce the logic delay between flip-flops, so that the circuit 
can operate at a higher maximum clock frequency. 


Solution: As we showed in Section 5.15, the performance of the counter circuit is limited 
by the delay through its cascaded AND gates. To increase the circuit’s performance we 
can refactor the AND gates as illustrated in Figure 5.75. The longest delay path in this 
redesigned circuit, which starts at flip-flop Qo and ends at Q3, provides the minimum clock 
period 


Tinin = teQ + tanp + txor + tsu 
= 1.04+14+4+1.2+4+0.6ns = 4.2 ns 


The redesigned counter has a maximum clock frequency of Fina, = 1/4.2 ns = 238.1 MHz, 
compared to the result for the original counter, which was 156.25 MHz. 
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Clock 


Figure 5.75 —_A faster 4-bit counter. 


Problem: In Example 5.17 we showed how to perform timing analysis when there may be Example 5.23 
different delays associated with the clock signal at each of the flip-flops in a circuit. The 
circuit in Figure 5.76 includes three flip-flops, Q;, Qo, and Q3, with corresponding clock 
delays A;, A>, and A3. The flip-flop timing parameters are f,,, = 0.6 ns, t, = 0.4 ns, and 
0.8 < ta < Ins. Also, the delay through a logic gate is given by | + 0.1k, where k is the 
number of inputs to the gate. 
Calculate the Fig, for the circuit for the following sets of clock delays: A; = Az = 
A3 = Ons; A; = A3 = Onsand A> = 0.7 ns; A; = I ns, Ao = 0, and A3 = 0.5 ns. Also, 
determine if there are any hold time violations in the circuit for the sets of clock delays 
A; = Ao = A; = Ons, and A; = I ns, Ap = 0, A3 = 0.5 ns. 
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Clock 


Figure 5.76 — A circuit with clock skews. 


Solution: For the values A; = A, = A; = 0, there is no clock skew. Let the path delay 
between two flip-flops, Q; and Qj, be Ta,.g,. The longest path delays are calculated by 
including the maximum value of f,q at flip-flop Q;, and the setup time at flip-flop Q;, as 
follows 


TQ,->Q, = tea + tyor + tanp + tsu = 14+1.2+1.2+0.6=4ns 

TQ,+Q = teg + tand + tsu = 1+1.2+0.6 = 2.8 ns 

TQ,+Q, = ta + tnor + tsu = 1+1.1+0.6 = 2.7 ns 

TaQ,>Q, = ta + tsy = 1+ 0.6 = 1.6ns 

TQ,>Q) = tea + tyor + tanp + tsu = 14+1.2+1.2+0.6=4ns 
Since the critical path delay Tg, +9, = To,+0, =4 ns, then Ping, = 1/(4 x 10-°) = 
250 MHz. 

For the values A; = A; = 0, and A, = 0.7ns, there is clock skew for the paths between 
flip-flops Q; and Qs, as well as Q3 and Q). Adjusting the longest delays calculated above 
to account for clock skew, we have 

TQ, +Q =4— bey = 4— (Ao — Ay) = 4—0.7 = 3.3 ns 

Tg,+0, = 2.7 — tskew = 2.7 — (A3 — Az) = 2.7— (0 — 0.7) = 3.4 ns 

TQ,>05 = 4 — teow = 4— (Ao — A3) = 4—0.7 = 3.3 ns 
The critical path delay now starts at flip-flop Q» and ends at flip-flop Q3. Since Tg,--9, = 
3.4 ns, then Fing, = 1/(3.4 x 107°) = 294 MHz. 

For the values A; = 1, Ay = 0, and A3 = 0.5 ns, there is clock skew for the paths 
between all flip-flops. Adjusting the longest delays calculated for the case with no clock 
skew, we have 

Te,-a, =4— thew = 4 — (An — At) =4— O- 1) =5S5ns 

Ta,+Q, = 2.7 — tskew = 2.7 — (A3 — Az) = 2.7 — (0.5) = 2.2 ns 
TaQ,+9, = 1.6 — tekey = 1.6 — (Ai — Az) = 1.6— 1 — 0.5) = 1.1 ns 
TQ,>Q = 4 — ftykeyw = 4— (Azo — A3) = 4— (0— 0.5) = 4.5 ns 
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The critical path delay is from flip-flop Q; to flip-flop Q.. Since Tg,.9, = 5 ns, then 
Finax = 1/(5 x 10-°) = 200 MHz. 

To determine whether any hold time violations exist, we need to calculate the shortest 
path delays, using the minimum value of f.g. For A; = Az = A3 = 0, we have 


Ta,+9, = tea + txor + tanp = 0.8 + 1.2 + 1.2 = 3.2 ns 
Ta,3Q, = ta + tand = 0.84 1.2 = 2 ns 

TQ,>0; = ta + tvor = 0.8 + 1.1 = 1.9 ns 

Ta,39, = tq = 0.8 ns 

Ta,3Q, = tea + txor + tanp = 0.8 + 1.2 4+ 1.2 = 3.2 ns 


Since the shortest path delay Tg,.9, = 0.8 ns > t, there is no hold time violation in the 
circuit. 

We showed in Section 5.15 that if the shortest path delay through a circuit is called 
T;, then a hold time violation occurs if 7; — fskey < t,. Adjusting the shortest path delays 
calculated above for the values Aj; = 1, Ay = 0, and A3 = 0.5 gives 


Ta, +Q) = 3.2 — tskew = 3-2 — (Ar — Ay) = 3.2— (0-1) = 4.2 ns 
Ta, +0, = 1.9 = tstew = 1.9 — (A3 — Ar) = 1.9-0.5 = 1.4 ns 

Ta,-+Q, = 0.8 — tyke = 0.8 — (Ay — Az) = 0.8 — (1 — 0.5) = 0.3 ns 
TQ, -+Q) = 3.2 — tyke = 3-2 — (Ar — Az) = 3.2 — (0 — 0.5) = 3.7 ns 


The shortest path delay is Tg,.9, = 0.3 ns < t,, which represents a hold time violation. 
Hence, the circuit may not function reliably regardless of the frequency of the clock signal. 


PROBLEMS 


Answers to problems marked by an asterisk are given at the back of the book. 


5.1 Consider the timing diagram in Figure P5.1. Assuming that the D and Clock inputs shown 
are applied to the circuit in Figure 5.10, draw waveforms for the Q,,, Q,, and Q. signals. 


Clock 


Figure P5.1_ ~~ Timing diagram for Problem 5.1. 
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Figure 5.4 shows a latch built with NOR gates. Draw a similar latch using NAND gates. 
Derive its characteristic table and show its timing diagram. 


Show a circuit that implements the gated SR latch using NAND gates only. 


Given a 100-MHz clock signal, derive a circuit using D flip-flops to generate 50-MHz 
and 25-MHz clock signals. Draw a timing diagram for all three clock signals, assuming 
reasonable delays. 


An SR flip-flop is a flip-flop that has set and reset inputs like a gated SR latch. Show how 
an SR flip-flop can be constructed using a D flip-flop and other logic gates. 


The gated SR latch in Figure 5.5a has unpredictable behavior if the S and R inputs are 
both equal to 1 when the C/k changes to 0. One way to solve this problem is to create a 
set-dominant gated SR latch in which the condition S = R = 1| causes the latch to be set to 
1. Design a set-dominant gated SR latch and show the circuit. 


Show how a JK flip-flop can be constructed using a T flip-flop and other logic gates. 


Consider the circuit in Figure P5.2. Assume that the two NAND gates have much longer 
(about four times) propagation delay than the other gates in the circuit. How does this 
circuit compare with the circuits that we discussed in this chapter? 


A 

D 
C 

E 
B 


Figure P5.2 — Circuit for Problem 5.8. 


Write Verilog code that represents a T flip-flop with an asynchronous clear input. Use 
behavioral code, rather than structural code. 


Write Verilog code that represents a JK flip-flop. Use behavioral code, rather than structural 
code. 


5.16 
*5.17 
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Synthesize a circuit for the code written for Problem 5.10 by using your CAD tools. Simulate 
the circuit and show a timing diagram that verifies the desired functionality. 


A universal shift register can shift in both the left-to-right and right-to-left directions, and 
it has parallel-load capability. Draw a circuit for such a shift register. 


Write Verilog code for a universal shift register with n bits. 


Design a four-bit synchronous counter with parallel load. Use T flip-flops, instead of the D 
flip-flops used in Section 5.9.3. 


Design a three-bit up/down counter using T flip-flops. It should include a control input 
called Up/Down. If Up/Down = 0, then the circuit should behave as an up-counter. If 
Up/Down = 1, then the circuit should behave as a down-counter. 


Repeat Problem 5.15 using D flip-flops. 


The circuit in Figure P5.3 looks like a counter. What is the counting sequence of this circuit? 


Figure P5.3 The circuit for Problem 5.17. 


5.18 Consider the circuit in Figure P5.4. How does this circuit compare with the circuit in Figure 


5.16? Can the circuits be used for the same purposes? If not, what is the key difference 


between them? 


Figure P5.4 — Circuit for Problem 5.18. 
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Construct a NOR-gate circuit, similar to the one in Figure 5.11a, which implements a 
negative-edge-triggered D flip-flop. 


Write Verilog code that represents a modulo-12 up-counter with synchronous reset. 


For the flip-flops in the counter in Figure 5.24, assume that ¢,,, = 3 ns, t, = 1 ns, and the 
propagation delay through a flip-flop is 1 ns. Assume that each AND gate, XOR gate, and 
2-to-1 multiplexer has the propagation delay equal to 1 ns. What is the maximum clock 
frequency for which the circuit will operate correctly? 


Write Verilog code that represents an eight-bit Johnson counter. Synthesize the code with 
your CAD tools and give a timing simulation that shows the counting sequence. 


Write Verilog code in the style shown in Figure 5.51 that represents a ring counter. Your 
code should have a parameter n that sets the number of flip-flops in the counter. 


A ring oscillator is a circuit that has an odd number, n, of inverters connected in a ringlike 
structure, as shown in Figure P5.5. The output of each inverter is a periodic signal with a 
certain period. 


Figure P5.5 = A ring oscillator. 


(a) Assume that all the inverters are identical; hence they all have the same delay, called 
tp. Let the output of one of the inverters be named f. Give an equation that expresses the 
period of the signal f in terms of n and ¢,. 

(b) For this part you are to design a circuit that can be used to experimentally measure the 
delay ¢, through one of the inverters in the ring oscillator. Assume the existence of an input 
called Reset and another called Interval. The timing of these two signals is shown in Figure 
P5.6. The length of time for which /nterval has the value 1 is known. Assume that this 
length of time is 100 ns. Design a circuit that uses the Reset and Interval signals and the 
signal f from part (a) to experimentally measure ¢,. In your design you may use logic gates 
and subcircuits such as adders, flip-flops, counters, registers, and so on. 


Reset | | 
Interval | | 


<= _ 100 ns —W_ 


Figure P5.6 _ Timing of signals for Problem 5.24. 
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5.25  Acircuit for a gated D latch is shown in Figure P5.7. Assume that the propagation delay 
through either a NAND gate or an inverter is | ns. Complete the timing diagram given in 
the figure, which shows the signal values with | ns resolution. 


D 


Clock 


Le) 


0 
Do LUT LET PP LEE LEE TT] 
AO | 
eo UU 


Figure P5.7 — Circuit and timing diagram for Problem 5.25. 


*5.26 A logic circuit has two inputs, Clock and Start, and two outputs, f and g. The behavior of 
the circuit is described by the timing diagram in Figure P5.8. When a pulse is received 


Clock ! | | | | | | | | | | | | | | | | 
0 

Stat - 

a Gy _| he 
» —- | TPE] | ft } 
f 0 

a 2 | | 
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Figure P5.8 Timing diagram for Problem 5.26. 
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on the Start input, the circuit produces pulses on the f and g outputs as shown in the 
timing diagram. Design a suitable circuit using only the following components: a three- 
bit resettable positive-edge-triggered synchronous counter and basic logic gates. For your 
answer assume that the delays through all logic gates and the counter are negligible. 


5.27 The following code checks for adjacent ones in an n-bit vector. 


always @(A) 
begin 
f =A[1] & A[0]; 
for (k = 2;k <n; k =k+1) 
f=f | (A[k] & A[k—1}); 
end 


With blocking assignments this code produces the desired logic function, which is f = 
a\dy9 + -+++4y—-1A,_2. What logic function is produced if we change the code to use 
non-blocking assignments? 


5.28 The Verilog code in Figure P5.9 represents a 3-bit linear-feedback shift register (LFSR). 
This type of circuit generates a counting sequence of pseudo-random numbers that repeats 
after 2” — 1 clock cycles, where n is the number of flip-flops in the LFSR. Synthesize a 
circuit to implement the LFSR in a chip. Draw a diagram of the circuit. Simulate the 
circuit’s behavior by loading the pattern 001 into the LFSR and then enabling the register 
to count. What is the counting sequence? 


module Ifsr (R, L, Clock, Q); 
input [0:2] R; 
input L, Clock; 
output reg [0:2] Q; 


always @(posedge Clock) 
if (L) 
Q<=R; 
else 


Q <= {Q[2], Q[0] * Q[2], Q[1]}; 
endmodule 


Figure P5.9 = Code for a linear-feedback shift register. 
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5.29 Repeat Problem 5.28 for the Verilog code in Figure P5.10. 


module Ifsr (R, L, Clock, Q); 
input [0:2] R; 
input L, Clock; 
output reg [0:2] Q; 


always @(posedge Clock) 
if (L) 
Q<=R; 
else 


Q<= {QI2], QO], QL1] * Q(2]}; 
endmodule 


Figure P5.10 = Code for a linear-feedback shift register. 


5.30 The Verilog code in Figure P5.11 is equivalent to the code in Figure P5.9, except that 
blocking assignments are used. Draw the circuit represented by this code. What is its 
counting sequence? 


module lIfsr (R, L, Clock, Q); 
input [0:2] R; 
input L, Clock; 
output reg [0:2] Q; 


always @(posedge Clock) 
if (L) 
Q<=R; 
else 
begin 
QO] = Q(2]; 
QU1] = QI0] * QI2); 
Q[2] = QI1]; 


end 
endmodule 


Figure P5.11 = Code for Problem 5.30. 
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5.31. The Verilog code in Figure P5.12 is equivalent to the code in Figure P5.10, except that 
blocking assignments are used. Draw the circuit represented by this code. What is its 
counting sequence? 


module Ifsr (R, L, Clock, Q); 
input [0:2] R; 
input L, Clock; 
output reg [0:2] Q; 


always @(posedge Clock) 
if (L) 
Q<=R; 
else 
begin 
QO] = QI2]; 
Q[1] = Q[0); 
Q[2] = QI1] * QI2); 


end 


endmodule 


Figure P5.12 = Code for Problem 5.31. 


5.32 The circuit in Figure 5.59 gives a shift register in which the parallel-load control input 
is independent of the enable input. Show a different shift register circuit in which the 
parallel-load operation can be performed only when the enable input is also asserted. 
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SYNCHRONOUS SEQUENTIAL CIRCUITS 


CHAPTER OBJECTIVES 


In this chapter you will learn about: 


e Design techniques for circuits that use flip-flops 

e The concept of states and their implementation with flip-flops 

e Synchronous control by using a clock signal 

e Sequential behavior of digital circuits 

e Acomplete procedure for designing synchronous sequential circuits 
e Verilog specification of sequential circuits 


e The concept of finite state machines 
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In preceding chapters we considered combinational logic circuits in which outputs are determined fully by 
the present values of inputs. We also discussed how simple storage elements can be implemented in the form 
of flip-flops. The output of a flip-flop depends on the state of the flip-flop rather than the value of its inputs 
at any given time; the inputs cause changes in the state. 

In this chapter we deal with a general class of circuits in which the outputs depend on the past behavior 
of the circuit, as well as on the present values of inputs. They are called sequential circuits. In most cases 
a clock signal is used to control the operation of a sequential circuit; such a circuit is called a synchronous 
sequential circuit. The alternative, in which no clock signal is used, is called an asynchronous sequential 
circuit. Synchronous circuits are easier to design and are used in a vast majority of practical applications; 
they are the topic of this chapter. Asynchronous circuits will be discussed in Chapter 9. 

Synchronous sequential circuits are realized using combinational logic and one or more flip-flops. The 
general structure of such a circuit is shown in Figure 6.1. The circuit has a set of primary inputs, W, and 
produces a set of outputs, Z. The stored values in the flip-flops are referred to as the state, Q, of the circuit. 
Under control of the clock signal, the flip-flops change their state as determined by the combinational logic 
that feeds the inputs of these flip-flops. Thus the circuit moves from one state to another. To ensure that only 
one transition from one state to another takes place during one clock cycle, the flip-flops have to be of the 
edge-triggered type. They can be triggered either by the positive (0 to | transition) or by the negative (1 to 0 
transition) edge of the clock. We will use the term active clock edge to refer to the clock edge that causes the 
change in state. 

The combinational logic that provides the input signals to the flip-flops has two sources: the primary 
inputs, W, and the present (current) state of the flip-flops, Q. Thus changes in state depend on both the present 
state and the values of the primary inputs. 

Figure 6.1 indicates that the outputs of the sequential circuit are generated by another combinational 
circuit, such that the outputs are a function of the present state of the flip-flops and of the primary inputs. 
Although the outputs always depend on the present state, they do not necessarily have to depend directly on 
the primary inputs. Thus the connection shown in blue in the figure may or may not exist. To distinguish 
between these two possibilities, it is customary to say that sequential circuits whose outputs depend only on 
the state of the circuit are of Moore type, while those whose outputs depend on both the state and the primary 
inputs are of Mealy type. These names are in honor of Edward Moore and George Mealy, who investigated 
the behavior of such circuits in the 1950s. 


Combinational 
circuit 


Combinational 
circuit 


Clock 


Figure 6.1 ~~ The general form of a sequential circuit. 
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Sequential circuits are also called finite state machines (FSMs), which is a more formal name that is often 
found in technical literature. The name derives from the fact that the functional behavior of these circuits can 
be represented using a finite number of states. In this chapter we will often use the term finite state machine, 
or simply machine, when referring to sequential circuits. 


6.1. Basic DESIGN STEPS 


Sequential circuits are often used to control the operation of physical systems. We will 
introduce the techniques for designing such circuits by means of a simple example. 

Consider an application where the speed of an automatically-controlled vehicle has 
to be regulated as follows. The vehicle is designed to run at some predetermined speed. 
However, due to some operational conditions the speed may exceed the desirable limit, in 
which case the vehicle has to be slowed down. To determine when such action is needed, 
the speed is measured at regular intervals. Let a binary signal w indicate whether the speed 
exceeds the required limit, such that w = 0 means that the speed is within acceptable range 
and w = | indicates excessive speed. The desired control strategy is that if w = 1 during 
two or more consecutive measurements, a control signal z must be asserted to cause the 
vehicle to slow down. Thus, z = 0 allows the current speed to be maintained, while z = | 
reduces the speed. Let a signal Clock define the required timing intervals, such that the 
speed is measured once during each clock cycle. Therefore, we wish to design a circuit that 
meets the following specification: 


The circuit has one input, w, and one output, z. 
All changes in the circuit occur on the positive edge of the clock signal. 


3. The output z is equal to | if during two immediately preceding clock cycles the input 
w was equal to 1. Otherwise, the value of z is equal to 0. 


From this specification it is apparent that the output z depends on both present and past 
values of w. Consider the sequence of values of the w and z signals for the clock cycles 
shown in Figure 6.2. The values of w are assumed to be generated by the vehicle being 
controlled; the values of z correspond to our specification. Since w = | in clock cycle f 
and remains equal to | in cycle t4, then z should be set to | in cycle ts. Similarly, z should 
be set to 1 in cycle tg because w = | in cycles fg and f7, and it should also be set to | in fo 
because w = | in f, and fg. As seen in the figure, for a given value of input w the output z 
may be either 0 or 1. For example, w = 0 during clock cycles fz and ts, but z = 0 during 
and z = | during fs. Similarly, w = | during t, and fg, but z = 0 during f; and z = | during 
tg. This means that z is not determined only by the present value of w, so there must exist 
different states in the circuit that determine the value of z. 


6.1.1 STATE DIAGRAM 


The first step in designing a finite state machine is to determine how many states are needed 
and which transitions are possible from one state to another. There is no set procedure for 
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Clock cycle: to t; to t, 
w: O 1 O 1T 1 0 1 1 1 O 1 
z 0 0 0 0 0 1 0 0 1 =1 0 


Figure 6.2 = Sequences of input and output signals. 


this task. The designer must think carefully about what the machine has to accomplish. A 
good way to begin is to select one particular state as a starting state; this is the state that the 
circuit should enter when power is first turned on or when a reset signal is applied. For our 
example let us assume that the starting state is called state A. As long as the input w is 0, 
the circuit need not do anything, and so each active clock edge should result in the circuit 
remaining in state A. When w becomes equal to 1, the machine should recognize this, and 
move to a different state, which we will call state B. This transition takes place on the next 
active clock edge after w has become equal to 1. In state B, as in state A, the circuit should 
keep the value of output z at 0, because it has not yet seen w = | for two consecutive clock 
cycles. When in state B, if w is 0 at the next active clock edge, the circuit should move back 
to state A. However, if w = | when in state B, the circuit should change at the next active 
clock edge to a third state, called C, and it should then generate an output z = 1. The circuit 
should remain in state C as long as w = | and should continue to maintain z = 1. When 
w becomes 0, the machine should move back to state A. Since the preceding description 
handles all possible values of input w that the machine can encounter in its various states, 
we can conclude that three states are needed to implement the desired machine. 

Now that we have determined in an informal way the possible transitions between states, 
we will describe a more formal procedure that can be used to design the corresponding 
sequential circuit. Behavior of a sequential circuit can be described in several different 
ways. The conceptually simplest method is to use a pictorial representation in the form 
of a state diagram, which is a graph that depicts states of the circuit as nodes (circles) 
and transitions between states as directed arcs. The state diagram in Figure 6.3 defines the 
behavior that corresponds to our specification. States A, B, and C appear as nodes in the 
diagram. Node A represents the starting state, and it is also the state that the circuit will reach 
after an input w = 0 is applied. In this state the output z should be 0, which is indicated 
as A/z = 0 in the node. The circuit should remain in state A as long as w = 0, which is 
indicated by an arc with a label w = 0 that originates and terminates at this node. The first 
occurrence of w = | (following the condition w = 0) is recorded by moving from state A to 
state B. This transition is indicated on the graph by an arc originating at A and terminating 
at B. The label w = 1 on this arc denotes the input value that causes the transition. In state 
B the output remains at 0, which is indicated as B/z = 0 in the node. 

When the circuit is in state B, it will change to state C if w is still equal to 1 at the 
next active clock edge. In state C the output z becomes equal to 1. If w stays at 1 during 
subsequent clock cycles, the circuit will remain in state C maintaining z = 1. However, if 
w becomes 0 when the circuit is either in state B or in state C, the next active clock edge 
will cause a transition to state A to take place. 
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w= 1 


Figure 6.3 = State diagram of a simple sequential circuit. 


Present Next state Output 
state w=0 w=1 Zz 
A A B 0) 
B A C 0 
C A C 1 


Figure 6.4 _—_ State table corresponding to Figure 6.3. 


In the diagram we indicated that the Reset input is used to force the circuit into state 
A. We could treat Reset as just another input to the circuit, and show a transition from each 
state to the starting state A under control of the input Reset. This would complicate the 
diagram unnecessarily. Instead, we use a single arrow with the Reset label, as shown in 
Figure 6.3, to indicate that the Reset input causes a change to the starting state regardless 
of what state the circuit happens to be in. 


6.1.2 STATE TABLE 


Although the state diagram provides a description of the behavior of a sequential circuit 
that is easy to understand, to proceed with the implementation of the circuit it is conve- 
nient to translate the information contained in the state diagram into a tabular form. Figure 
6.4 shows the state table for our sequential circuit. The table indicates all transitions from 
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each present state to the next state for different values of the input signal. Note that the 
output z is specified with respect to the present state, namely, the state that the circuit is 
in at present time. Note also that we did not include the Reset input; instead, we made an 
implicit assumption that the first state in the table is the starting state. 

We now show the design steps that will produce the final circuit. To explain the basic 
design concepts, we first go through a traditional process of manually performing each 
design step. This is followed by a discussion of automated design techniques that use 
modern computer aided design (CAD) tools. 


6.1.3 STATE ASSIGNMENT 


The state table in Figure 6.4 defines the three states in terms of letters A, B, and C. When 
implemented in a logic circuit, each state is represented by a particular valuation (combi- 
nation of values) of state variables. Each state variable may be implemented in the form of 
a flip-flop. Since three states have to be realized, it is sufficient to use two state variables. 
Let these variables be y; and yp. 

Now we can adapt the general block diagram in Figure 6.1 to our example as shown in 
Figure 6.5, to indicate the structure of the circuit that implements the required finite state 
machine. Two flip-flops represent the state variables. In the figure we have not specified 
the type of flip-flops to be used; this issue is addressed in the next subsection. From the 
specification in Figures 6.3 and 6.4, the output z is determined only by the present state of 
the circuit. Thus the block diagram in Figure 6.5 shows that z is a function of only y; and 
y2; our design is of Moore type. We need to design a combinational circuit that uses y; and 
y2 as input signals and generates a correct output signal z for all possible valuations of these 
inputs. 


w 


Combinational Combinational 
circuit circuit 


Clock 


Figure 6.5 =A general sequential circuit with input w, output z, and two state flip-flops. 


6.1 Basic DESIGN STEPS 


Next state 
Present 

state w=0 wel Output 

Zz 

y2y1 Y2Y, YY, 

A 00 00 01 0) 

B 01 00 10 0 

C 10 00 10 1 

11 dd dd d 


Figure 6.6 —State-assigned table corresponding to Figure 6.4. 


The signals y; and y2 are also fed back to the combinational circuit that determines 
the next state of the FSM. This circuit also uses the primary input signal w. Its outputs are 
two signals, Y; and Y2, which are used to set the state of the flip-flops. Each active edge 
of the clock will cause the flip-flops to change their state according to the values of Y; and 
Y> at that time. Therefore, Y, and Y2 are called the next-state variables, and y; and y2 are 
called the present-state variables. We need to design a combinational circuit with inputs 
Ww, y1, and y2, such that for all valuations of these inputs the outputs Y; and Y2 will cause 
the machine to move to the next state that satisfies our specification. The next step in the 
design process is to create a truth table that defines this circuit, as well as the circuit that 
generates z. 

To produce the desired truth table, we assign a specific valuation of variables y, and y2 
to each state. One possible assignment is given in Figure 6.6, where the states A, B, and C 
are represented by y2y; = 00, 01, and 10, respectively. The fourth valuation, y2y; = 11, is 
not needed in this case. 

The type of table given in Figure 6.6 is usually called a state-assigned table. This table 
can serve directly as a truth table for the output z with the inputs y; and y2. Although for 
the next-state functions Y; and Y, the table does not have the appearance of a normal truth 
table, because there are two separate columns in the table for each value of w, it is obvious 
that the table includes all of the information that defines Y, and Y> in terms of valuations 
of inputs w, y;, and y2. 


6.1.4 CHOICE OF FLIP-FLOPS AND DERIVATION OF NEXT-STATE AND 
OUTPUT EXPRESSIONS 


From the state-assigned table in Figure 6.6, we can derive the logic expressions for the 
next-state and output functions. But first we have to decide on the type of flip-flops that 
will be used in the circuit. The most straightforward choice is to use D-type flip-flops, 
because in this case the values of Y; and Y> are simply clocked into the flip-flops to become 
the new values of y; and yz. In other words, if the inputs to the flip-flops are called D, 
and Dy», then these signals are the same as Y; and Y>. Note that the diagram in Figure 6.5 
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Yo 
Ignoring don’t cares Using don’t cares 
Y, =wyiy2 Y,; = wy1y2 
Y, = wy) Yot wy yo Y, = wy, +wy, 
= w(y, + Yo) 
Z = Wy B= 


Figure 6.7 _ Derivation of logic expressions for the table in Figure 6.6. 


corresponds exactly to this use of D-type flip-flops. For other types of flip-flops, such as 
JK type, the relationship between the next-state variable and inputs to a flip-flop is not as 
straightforward; we will consider this situation in Section 6.7. 

The required logic expressions can be derived as shown in Figure 6.7. We use Karnaugh 
maps to make it easy for the reader to verify the validity of the expressions. Recall that 
in Figure 6.6 we needed only three of the four possible binary valuations to represent the 
states. The fourth valuation, y2y; = 11, will not occur in the circuit because the circuit is 
constrained to move only within states A, B, and C; therefore, we may choose to treat this 
valuation as a don’t-care condition. The resulting don’t-care squares in the Karnaugh maps 
are denoted by d’s. Using the don’t cares to simplify the expressions, we obtain 


Y = wyYo 
= wore) 
z= y2 
If we do not use don’t cares, then the resulting expressions are slightly more complex; they 
are shown in the gray-shaded area of Figure 6.7. 


Since D; = Y, and Dz = 2, the logic circuit that corresponds to the preceding expres- 
sions is implemented as shown in Figure 6.8. Observe that a clock signal is included, and 
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Clock 


Resetn 


Figure 6.8 —_ Final implementation of the sequential circuit. 


the circuit is provided with an active-low reset capability. Connecting the clear input on 
the flip-flops to an external Resetn signal, as shown in the figure, provides a simple means 
for forcing the circuit into a known state. If we apply the signal Resetn = 0 to the circuit, 
then both flip-flops will be cleared to 0, placing the FSM into the state yoy; = 00. 


6.1.5 TiminG DIAGRAM 


To understand fully the operation of the circuit in Figure 6.8, let us consider its timing 
diagram presented in Figure 6.9. The diagram depicts the signal waveforms that correspond 
to the sequences of values in Figure 6.2. 

Because we are using positive-edge-triggered flip-flops, all changes in the signals occur 
shortly after the positive edge of the clock. The amount of delay from the clock edge depends 
on the propagation delays through the flip-flops. Note that the input signal w is also shown 
to change slightly after the active edge of the clock. This is a good assumption because in 
a typical digital system an input such as w would be just an output of another circuit that is 
synchronized by the same clock. We discuss the synchronization of input signals with the 
clock signal in Chapter 7. 
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Figure 6.9 Timing diagram for the circuit in Figure 6.8. 


A key point to observe is that even though w changes slightly after the active clock 
edge, and thus the value of w is equal to | (or 0) for almost the entire clock cycle, no change 
in the circuit will occur until the beginning of the next clock cycle when the positive edge 
causes the flip-flops to change their state. Thus the value of w must be equal to | for two 
clock cycles if the circuit is to reach state C and generate the output z = 1. 


6.1.6 SUMMARY OF DESIGN STEPS 


We can summarize the steps involved in designing a synchronous sequential circuit as 
follows: 


Obtain the specification of the desired circuit. 


2. Derive the states for the machine by first selecting a starting state. Then, given the 
specification of the circuit, consider all valuations of the inputs to the circuit and 
create new states as needed for the machine to respond to these inputs. To keep track 
of the states as they are visited, create a state diagram. When completed, the state 
diagram shows all states in the machine and gives the conditions under which the 
circuit moves from one state to another. 


3. Create a state table from the state diagram. Alternatively, it may be convenient to 
directly create the state table in step 2, rather than first creating a state diagram. 


4. In our sequential circuit example, there were only three states; hence it was a simple 
matter to create the state table that does not contain more states than necessary. 
However, in practice it is common to deal with circuits that have a large number of 
states. In such cases it is unlikely that the first attempt at deriving a state table will 
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produce optimal results, so that we may have more states than is really necessary. 
This can be corrected by a procedure that minimizes the number of states. We will 
discuss the process of state minimization in Section 6.6. 


5. Decide on the number of state variables needed to represent all states and perform the 
state assignment. There are many different state assignments possible for a given 
sequential circuit. Some assignments may be better than others. In the preceding 
example we used what seemed to be a natural state assignment. We will return to this 
example in Section 6.2 and show that a different assignment may lead to a simpler 
circuit. 


6. Choose the type of flip-flops to be used in the circuit. Derive the next-state logic 
expressions to control the inputs to all flip-flops and then derive logic expressions for 
the outputs of the circuit. So far we have used only D-type flip-flops. We will 
consider other types of flip-flops in Section 6.7. 


7. Implement the circuit as indicated by the logic expressions. 


We introduced the steps used in the design of sequential circuits by using an example of 
a simple controller for a vehicle. In this case the value of w reflected the speed of the vehicle. 
But, we could consider this example in a more general sense, in that the value of w in each 
clock cycle provides a sequence of symbols over time, where each symbol has the value 0 
or 1. Our circuit can detect sequences of symbols that consist of consecutive 1s, and indicate 
this by generating z = | following each occurrence of two consecutive 1s. Beacause the 
circuit detects a specific sequence of symbols, we can say that it acts as a sequence detector. 
Similar circuits can be designed to detect a variety of different sequences. 
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We have illustrated the design steps using a very simple sequential circuit. We will now 
consider a slightly more complex example that is closely tied to application in computer 
systems. 

A computer system usually contains a number of registers that hold data during various 
operations. Sometimes it is necessary to swap the contents of two registers. Typically, 
this is done by using a temporary location, which is usually a third register. For example, 
suppose that we want to swap the contents of registers R1 and R2. We can achieve this by 
first transferring the contents of R2 into the third register, say R3. Next, we transfer the 
contents of R1 into R2. Finally, we transfer the contents of R3 into R1. 

Registers in a computer system are connected via an interconnection network, as shown 
in Figure 6.10. In addition to the wires that connect to the network, each register has two 
control signals. The Rk,,; signal causes the contents of register Rk to be placed into the 
interconnection network. The Rk;, signal causes the data from the network to be loaded 
into Rk. The Rk,y, and Rk;, signals are generated by a control circuit, which is a finite state 
machine. For our example, we will design a control circuit that swaps the contents of R1 
and R2, in response to an input w = |. Therefore, the inputs to the control circuit will be 
w and Clock. The outputs will be Rlow, Rlin, R2ou, R2in, R3 our, R3in, and Done which 
indicates the completion of the required transfers. 


Example 6.1 
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Clock ———» Control Interconnection 


circuit network 


Done 


Figure 6.10 — System for Example 6.1. 


The desired swapping operation will be performed as follows. The contents of R2 are 
first loaded into R3, using the control signals R2,,,; = 1 and R3;, = 1. Then the contents of 
R1 are transferred into R2, using R1oy, = 1 and R2;, = 1. Finally, the contents of R3 (which 
are the previous contents of R2) are transferred into R1, using R3,,, = 1 and R1;, = 1. Since 
this step completes the required swap, we will set the signal Done = 1. Assume that the 
swapping is performed in response to a pulse on the input w, which has a duration of one 
clock cycle. Figure 6.11 gives a state diagram for a sequential circuit that generates the 
output control signals in the required sequence. Note that to keep the diagram simple, we 
have indicated the output signals only when they are equal to 1. In all other cases the output 
signals are equal to 0. 

In the starting state, A, no transfer is indicated, and all output signals are 0. The circuit 
remains in this state until a request to swap arrives in the form of w changing to |. In state 
B the signals required to transfer the contents of R2 into R3 are asserted. The next active 
clock edge places these contents into R3. This clock edge also causes the circuit to change 
to state C, regardless of whether w is equal to 0 or 1. In this state the signals for transferring 
R1 into R2 are asserted. The transfer takes place at the next active clock edge, and the 
circuit changes to state D regardless of the value of w. The final transfer, from R3 to R1, is 
performed on the clock edge that leaves state D, which also causes the circuit to return to 
state A. 

Figure 6.12 presents the same information in a state table. Since there are four states, 
we can use two state variables, y. and y,. A straightforward state assignment where the 
states A, B, C, and D are assigned the valuations yoy; = 00, 01, 10, and 11, respectively, 
leads to the state-assigned table in Figure 6.13. Using this assignment and D-type flip-flops, 
the next-state expressions can be derived as shown in Figure 6.14. They are 


Y,; =wy, + yiy2 
VY) = iyo + Y\y2 


6.1 Basic DESIGN STEPS 


= Reset 


D/R3,,, = 1, lig = 1, Done = 1 


Figure 6.11 — State diagram for Example 6.1. 


Present Next state Outputs 
state w=0 wel Rout Rlin R2out R2in R3 out R3in Done 
A A B 0 0 0 0 0 0 0 
B C C 0 0 1 0 0 1 0 
C D D 1 0 0 1 0 0 0 
D A A 0 1 0 0 1 0 1 


Figure 6.12 — State table for Example 6.1. 


The output control signals are derived as 


Rh out = R2in = Y1y2 


Rlin = R3ou = Done = yiy2 
R2out = R3 in = Yiy2 


These expressions lead to the circuit in Figure 6.15. 
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Present Next state 


state w=0 wel 


Outputs 


Yay YY, YY, Rlour Rin R2our R2in R3 our R3in Done 


A 00 00 01 0 0 0 0 0 0 0 
B 01 10 10 0 0 1 0 0 1 0 
Cc 10 11 11 1 0 0 1 0 0 0 
D 11 00 00 0 1 0 0 1 0 1 


Figure 6.13 — State-assigned table corresponding to Figure 6.12. 


Fy = Wit Vis 


Yy = yiYot Viy2 


Figure 6.14 _ Derivation of next-state expressions for Figure 6.13. 


| 6.2 STATE-ASSIGNMENT PROBLEM 


Having introduced the basic concepts involved in the design of sequential circuits, we should 
revisit some details where alternative choices are possible. In Section 6.1.6 we suggested 
that some state assignments may be better than others. To illustrate this we can reconsider 
the example in Figure 6.4. We already know that the state assignment in Figure 6.6 leads 
to a simple-looking circuit in Figure 6.8. But can the FSM of Figure 6.4 be implemented 
with an even simpler circuit by using a different state assignment? 

Figure 6.16 gives one possible alternative. In this case we represent the states A, B, 
and C with the valuations y2y, = 00, 01, and 11, respectively. The remaining valuation, 
y2y, = 10, is not needed, and we will treat it as a don’t-care condition. The next-state and 
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w R1,, 
R35 u1 
Clock Done 
Rou, 
R2i, 
R261 
R3i, 
Figure 6.15 Final implementation of the sequential circuit for Example 6.1. 
py Next state 
state |w=0 w= 1 | Output 
& 
y2y1 YoY; YoY 
A 00 00 ol 0 
B 01 00 11 0 
C 11 00 11 1 
10 dd dd d 
Figure 6.16 Improved state assignment for the state table in 
Figure 6.4. 


output expressions derived from the figure will be 


Y, =D, Ww 
Y, = D2 = wy, 


4 = Y2 
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Clock 


Resetn 


Figure 6.17 _ Final circuit for the improved state assignment in Figure 6.16. 


These expressions define the circuit shown in Figure 6.17. Comparing this circuit with the 
one in Figure 6.8, we see that the cost of the new circuit is lower because it requires fewer 
gates. 

In general, circuits are much larger than our example, and different state assignments 
can have a substantial effect on the cost of the final implementation. While highly desirable, 
it is often impossible to find the best state assignment for a large circuit. The exhaustive 
approach of trying all possible state assignments is not practical because the number of 
available state assignments is huge. CAD tools usually perform the state assignment using 


heuristic techniques. These techniques are usually proprietary, and their details are seldom 
published. 


Example 6.2 


In Figure 6.13 we used a straightforward state assignment for the sequential circuit in 
Figure 6.12. Consider now the effect of interchanging the valuations assigned to states C 
and D, as shown in Figure 6.18. Then the next-state expressions are 


Yi = wy2 + Yo 
Y=y1 
as derived in Figure 6.19. The output expressions are 


Rl out = R2in = yiy2 


Rlin = R3 out = Done = Y1y2 
R26u = R3in = Yiy2 


These expressions lead to a slightly simpler circuit than the one given in Figure 6.15. 


6.2 STATE-ASSIGNMENT PROBLEM 


Present Next state 


state |w=od wel Outputs 


Y2¥1 YY, YY, Rlour Rin R2our R2in R3our R3in Done 


A 00 00 01 0 0 0 0 0 0 0 
B 01 11 11 0 0 1 0 0 1 0 
C 11 10 10 1 0 0 1 0 0 0 
D 10 00 00 0 1 0 0 1 0 1 


Figure 6.18 Improved state assignment for the state table in Figure 6.12. 


Y, = wyotyYo 


Figure 6.19 _ Derivation of next-state expressions for Figure 6.18. 


6.2.1 ONE-HotT ENCODING 


In previous examples we used the minimum number of flip-flops to represent the states of 
the FSM. Another interesting possibility is to use as many state variables as there are states. 
In this method, for each state all but one of the state variables are equal to 0. The variable 
whose value is 1 is deemed to be “hot.” The approach is known as the one-hot encoding 
method. 

Figure 6.20 shows how one-hot state assignment can be applied to the sequential circuit 
of Figure 6.4. Because there are three states, it is necessary to use three state variables. The 
chosen assignment is to represent the states A, B, and C using the valuations y3y2y; = 001, 
010, and 100, respectively. The remaining five valuations of the state variables are not used. 
They can be treated as don’t cares in the derivation of the next-state and output expressions. 
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Present Next state 
state wed we 1 | Gutput 


Y3Yo¥1 | Y3¥Yo¥1 = Y3¥2Y 


A| O01 001 010 0 
B/|} 010 001 100 
Cc; 100 001 100 


—- © 


Figure 6.20 = One-hot state assignment for the state table in 
Figure 6.4. 


Using this assignment, the resulting expressions are 


Y,;=w 
Yo =wyi 
Y3 = wy, 
l= Y3 


Note that none of the next-state variables depends on the present-state variable y2. This 
suggests that the second flip-flop and the expression Y2 = wy, are not needed. (CAD tools 
detect and eliminate such redundancies!) But even then, the derived expressions are not 
simpler than those obtained using the state assignment in Figure 6.16. Although in this case 
the one-hot assignment is not advantageous, there are many cases where this approach is 
attractive. 


Example 6.3 


The one-hot state assignment can be applied to the sequential circuit of Figure 6.12 as 
indicated in Figure 6.21. Four state variables are needed, and the states A, B, C, and D are 
encoded as y4y3y2y; = 0001, 0010, 0100, and 1000, respectively. Treating the remaining 
12 valuations of the state variables as don’t cares, the next-state expressions are 


Y; = wy + y4 
Y = wy 
Y3 = yo 
Y4 = y3 


It is instructive to note that we can derive these expressions simply by inspecting the state 
diagram in Figure 6.11. There are two arcs leading to state A (not including the arc for the 
Reset signal). These arcs mean that flip-flop y; should be set to | if the FSM is already in 
state A and w = 0, or if the FSM is in state D; hence Y; = wy, + y4. Similarly, flip-flop y2 
should be set to 1 if the present state is A and w = 1; hence Y2 = wy,. Flip-flops y3 and y4 
should be set to | if the FSM is presently in state B or C, respectively; hence Y3 = yz and 
Y4 = y3. 


6.3. MEALY STATE MODEL 


Present Next state 
State w=0 wed 
Y4¥3¥2¥, | Ya¥3¥2Y1 YaY3¥o¥, | Rout Rlin R2our R2in R3our R3in Done 


0001 0001 0010 0 0 0 0 0 0 0 
0010 0100 0100 0 0 
0100 1000 1000 1 0 
1000 0001 0001 0 1 


Outputs 


VaAwW> 
oOOr 


0 0 1 0 
1 0 0 0 
0 1 0 1 


Figure 6.21 One-hot state assignment for the state table in Figure 6.12. 


The output expressions are just the outputs of the flip-flops, such that 
Rl our = R2in = ¥3 
Rlin = R3 ou = Done = y4 
Rou = R3in — y2 


These expressions are simpler than those derived in Example 6.2, but four flip-flops are 
needed, rather than two. 

An important feature of the one-hot state assignment is that it often leads to simpler 
output expressions than do assignments with the minimal number of state variables. Simpler 
output expressions may lead to a faster circuit. For instance, if the outputs of the sequential 
circuit are just the outputs of the flip-flops, as is the case in our example, then these output 
signals are valid as soon as the flip-flops change their states. If more complex output 
expressions are involved, then the propagation delay through the gates that implement 
these expressions must be taken into account. We will consider this issue in Section 6.8.2. 


349 


The examples considered to this point show that there are many ways to implement a 
given finite state machine as a sequential circuit. Each implementation is likely to have a 
different cost and different timing characteristics. In the next section we introduce another 
way of modeling FSMs that leads to even more possibilities. 


6.3 MEALY STATE MODEL 


Our introductory examples were sequential circuits in which each state had specific values 
of the output signals associated with it. As we explained at the beginning of the chapter, 
such finite state machines are said to be of Moore type. We will now explore the concept 
of Mealy-type machines in which the output values are generated based on both the state 
of the circuit and the present values of its inputs. This provides additional flexibility in the 
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design of sequential circuits. We will introduce the Mealy-type machines, using a slightly 
altered version of a previous example. 

The essence of the first sequential circuit in Section 6.1 is to generate an output z = 1 
whenever a second occurrence of the input w = | is detected in consecutive clock cycles. 
The specification requires that the output z be equal to 1 in the clock cycle that follows the 
detection of the second occurrence of w = 1. Suppose now that we eliminate this latter 
requirement and specify instead that the output z should be equal to | in the same clock cycle 
when the second occurrence of w = | is detected. Then a suitable input-output sequence 
may be as shown in Figure 6.22. To see how we can realize the behavior given in this table, 
we begin by selecting a starting state, A. As long as w = 0, the machine should remain 
in state A, producing an output z = 0. When w = 1, the machine has to move to a new 
state, B, to record the fact that an input of 1 has occurred. If w remains equal to | when the 
machine is in state B, which happens if w = | for at least two consecutive clock cycles, the 
machine should remain in state B and produce an output z = 1. As soon as w becomes 0, 
z should immediately become 0 and the machine should move back to state A at the next 
active edge of the clock. Thus the behavior specified in Figure 6.22 can be achieved with 
a two-state machine, which has a state diagram shown in Figure 6.23. Only two states are 
needed because we have allowed the output value to depend on the present value of the 
input as well as the present state of the machine. The diagram indicates that if the machine 
is in state A, it will remain in state A if w = 0 and the output will be 0. This is indicated by an 
arc with the label w = 0/z = 0. When w becomes 1, the output stays at 0 until the machine 
moves to state B at the next active clock edge. This is denoted by the arc from A to B with 
the label w = 1/z = 0. Instate B the output will be 1 if w = 1, and the machine will remain 
in state B, as indicated by the label w = 1/z = 1 on the corresponding arc. However, if 
w = 0 in state B, then the output will be 0 and a transition to state A will take place at the 


Clock cycle: to to ts ts 
w: O 1 0 1 1 0 1 21 21 ~=~0 1 
zz: O 0 0 


Figure 6.22 — Sequences of input and output signals. 


w=0/z=0 oe w=l1/z=1 


w=0/z=0 


Figure 6.23 — State diagram of an FSM that realizes the task in Figure 6.22. 


6.3. MEALY STATE MODEL 


next active clock edge. A key point to understand is that during the present clock cycle the 
output value corresponds to the label on the arc emanating from the present-state node. 

We can implement the FSM in Figure 6.23, using the same design steps as in Sec- 
tion 6.1. The state table is shown in Figure 6.24. The table shows that the output z depends 
on the present value of input w and not just on the present state. Figure 6.25 gives the 
state-assigned table. Because there are only two states, it is sufficient to use a single state 
variable, y. Assuming that y is realized as a D-type flip-flop, the required next-state and 
output expressions are 


Y=D=w 
Z=wy 


The resulting circuit is presented in Figure 6.26 along with a timing diagram. The timing 
diagram corresponds to the input-output sequences in Figure 6.22. 

The greater flexibility of Mealy-type FSMs often leads to simpler circuit realizations. 
This certainly seems to be the case in our examples that produced the circuits in Figures 6.8, 
6.17, and 6.26, assuming that the design requirement is only to detect two consecutive 
occurrences of input w being equal to 1. We should note, however, that the circuit in 
Figure 6.26 is not the same in terms of output behavior as the circuits in Figures 6.8 
and 6.17. The difference is a shift of one clock cycle in the output signal in Figure 6.26b. If 
we wanted to produce exactly the same output behavior using the Mealy approach, we could 
modify the circuit in Figure 6.26a by adding another flip-flop as shown in Figure 6.27. This 
flip-flop merely delays the output signal, Z, by one clock cycle with respect to z, as indicated 


Present Next state Output z 
state |w=0 w=l1|/w=0 w=1 
A A B 0 0 
B A B 0 1 


Figure 6.24 = State table for the FSM in Figure 6.23. 


Present Next state Output 
state |w=0 w=1|/w=0 we 
Y Y Zz Zz 
A 0 1 0 0 
B 1 0 1 0 1 


Figure 6.25 


State-assigned table for the FSM in Figure 6.24. 
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w 
Clock 
Resetn 
a) ty ty ts t4 ts tg ty tg tg ho 
1 
0 
1 
0 
1 
0 
1 
0 


(b) Timing diagram 


Figure 6.26 Implementation of the FSM in Figure 6.25. 


in the timing diagram. By making this change, we effectively turn the Mealy-type circuit 
into a Moore-type circuit with output Z. Note that the circuit in Figure 6.27 is essentially 
the same as the circuit in Figure 6.17. 


Example 6.4 


In Example 6.1 we considered the control circuit needed to swap the contents of two regis- 
ters, implemented as a Moore-type finite state machine. The same task can be achieved 
using a Mealy-type FSM, as indicated in Figure 6.28. State A still serves as the reset state. 
But as soon as w changes from 0 to 1, the output control signals R2,,,, and R3;, are asserted. 
They remain asserted until the beginning of the next clock cycle, when the circuit will leave 
state A and change to B. In state B the outputs R1,,, and R2;, are asserted for both w = 0 
and w = 1. Finally, in state C the swap is completed by asserting R3,,; and R1in. 

The Mealy-type realization of the control circuit requires three states. This does not 
necessarily imply a simpler circuit because two flip-flops are still needed to implement 
the state variables. The most important difference in comparison with the Moore-type 
realization is the timing of output signals. A circuit that implements the FSM in Figure 6.28 


6.3. MEALY STATE MODEL 


tg ty ty ty ty ts 1 ty tg i) ho 

1 

Clock 
a 3G 
i 1 
0 
, 1 
0) 
1 

z 
0 
Zz 1 
0 


(b) Timing diagram 


Figure 6.27 — Circuit that implements the specification in Figure 6.2. 


generates the output control signals one clock cycle sooner than the circuits derived in 
Examples 6.1 and 6.2. 

Note also that using the FSM in Figure 6.28, the entire process of swapping the contents 
of R1 and R2 takes three clock cycles, starting and finishing in state A. Using the Moore-type 
FSM in Example 6.1, the swapping process involves four clock cycles before the circuit 
returns to state A. 

Suppose that we wish to implement this FSM using one-hot encoding. Then three 
flip-flops are needed, and the states A, B, and C may be assigned the valuations y3y2y; = 
001, 010, and 100, respectively. Examining the state diagram in Figure 6.28, we can de- 
rive the next-state equations by inspection. The input to flip-flop y; should have the value 
1 if the FSM is in state A and w = 0 or if the FSM is in state C; hence Y; = wy, + y3. 
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0 
1 | R3our = ts Blin = 1, Done = | 


out 


Figure 6.28 — State diagram for Example 6.4. 


Flip-flop y2 should be set to 1 if the FSM is in state A and w = 1; hence Y2 = wy. Flip-flop 
y3 should be set to | if the present state is B; hence Y3 = y2. The derivation of the out- 
put expressions, which we leave as an exercise for the reader, can also be done by inspection. 


The preceding discussion deals with the basic principles involved in the design of 
sequential circuits. Although it is essential to understand these principles, the manual 
approach used in the examples is difficult and tedious when large circuits are involved. We 
will now show how CAD tools are used to greatly simplify the design task. 


6.4 DESIGNOF FINITE STATE MACHINES UsING CAD TooLs 


Sophisticated CAD tools are available for finite state machine design, and we introduce 
them in this section. A rudimentary way of using CAD tools for FSM design could be 
as follows: The designer employs the manual techniques described previously to derive a 
circuit that contains flip-flops and logic gates from a state diagram. This circuit is entered 
into the CAD system by drawing a schematic diagram or by writing structural hardware 
description language (HDL) code. The designer then uses the CAD system to simulate the 
behavior of the circuit and uses the CAD tools to automatically implement the circuit in a 
chip, such as a PLD. 

It is tedious to manually synthesize a circuit from a state diagram. Since CAD tools 
are meant to obviate this type of task, more attractive ways of utilizing CAD tools for FSM 
design have been developed. A better approach is to directly enter the state diagram into the 
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CAD system and perform the entire synthesis process automatically. CAD tools support 
this approach in two main ways. One method is to allow the designer to draw the state 
diagram using a graphical tool similar to the schematic capture tool. The designer draws 
circles to represent states and arcs to represent state transitions and indicates the outputs 
that the machine should generate. Another and more popular approach is to write HDL 
code that represents the state diagram, as described below. 

Many HDLs provide constructs that allow the designer to represent a state diagram. 
To show how this is done, we will provide Verilog code that represents the simple machine 
designed manually as the first example in Section 6.1. Then we will use the CAD tools to 
synthesize a circuit that implements the machine in a chip. 


6.4.1 VERILOG CODE FOR MoorRE-TyYPE FSMs 


Verilog does not define a standard way of describing a finite state machine. Hence while 
adhering to the required Verilog syntax, there is more than one way to describe a given 
FSM. An example of Verilog code for the FSM of Figure 6.3 is given in Figure 6.29. The 
code reflects directly the FSM structure in Figure 6.5. The module simple has inputs Clock, 
Resetn, and w, and output z. Two-bit vectors y and Y represent the present and the next 
state of the machine, respectively. The parameter statement associates the present-state 
valuations used in Figure 6.6 with the symbolic names A, B, and C. 

The state transitions are specified by two separate always blocks. The first block 
describes the required combinational circuit. It uses a case statement to give the value of 
the next state Y for each value of the present state y. Each case alternative corresponds to 
a present state of the machine, and the associated if-else statement specifies the next state 
to be reached according to the value of w. This portion of the code corresponds to the 
combinational circuit on the left side of Figure 6.5. 

Since there are only three states in our FSM, the case statement in Figure 6.29 includes 
a default clause that covers the unused valuation y = 11. This clause specifies Y = 2’bxx, 
which indicates to the Verilog compiler that the value of Y can be treated as a don’t-care 
when y = 11. It is important to consider the effect on the compiler if we did not include 
the default clause. In that case, the value of Y would not be specified at all for y = 11. 
As explained in Chapter 5, the Verilog compiler assumes that a signal’s value must be kept 
unchanged whenever the value is not specified by an assignment, and a memory element 
should be synthesized. For the code in Figure 6.29 this means that a latch will be generated 
in the combinational circuit for Y if the default clause is not included. In general, it is 
always necessary to include a default clause when using a case statement that does not 
cover all of its alternatives. 

The second always block introduces flip-flops into the circuit. Its sensitivity list com- 
prises the reset and clock signals. Asynchronous reset is performed when the Resetn input 
goes to 0, placing the FSM into state A. The else clause stipulates that after each positive 
clock edge the y signal should take the value of the Y signal, thus implementing the state 
changes. 
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module simple (Clock, Resetn, w, z); 
input Clock, Resetn, w; 
output z; 
reg [2:1] y, Y; 
parameter [2:1] A =2’b00, B = 2’b01, C = 2’b10; 


// Define the next state combinational circuit 

always @(w, y) 
case (y) 

A: if(w) Y=B 

else Y=A 

B: if(w) Y=C 

else Y=A; 

C: if(w) Y=C 

else Y=A 
default: Y 

endcase 


// Define the sequential block 

always @(negedge Resetn, posedge Clock) 
if (Resetn ==0) y<=A; 
else y<=Y; 


// Define output 
assign z = (y ==C); 


endmodule 


Figure 6.29 Verilog code for the FSM in Figure 6.3. 


This is a Moore-type FSM in which the output z is equal to | only in state C. The output 
is conveniently defined in a conditional assignment statement that sets z = 1 ify = C. This 


realizes the combinational circuit on the right side of Figure 6.5. 


6.4.2 SYNTHESIS OF VERILOG CODE 


To give an example of the circuit produced by a synthesis tool, we synthesised the code in 
Figure 6.29 for implementation in a CPLD, such as those described in Appendix B. The 
synthesis resulted in two flip-flops, with inputs Y; and Y2, and outputs y; and yz. The 


next-state expressions generated by the synthesis tool are 
Y| = wy V2 


Y, = wy, + wy2 
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The output expression is 
Z=y2 


These expressions correspond to the case in Figure 6.7 when the unused state pattern 
yoy, = 11 is treated as don’t-cares in the Karnaugh maps for Y,, Y2, and z. 

Figure 6.30 depicts a part of the FSM circuit implemented in a CPLD. To keep the 
figure simple, only the logic resources used for the two macrocells that implement y,, yo, 
and z are shown. The parts of the macrocells used for the circuit are highlighted in blue. 

The w input to the circuit is shown connected to one of the interconnection wires in 
the CPLD. The source node in the chip that generates w is not shown. It could be either an 
input pin, or else w might be the output of another macrocell, assuming that the CPLD may 


Interconnection wires 


PAL-like block 
v/v y ss 
D> | [ve | 
y 4D 


eee 
So 


(Other macrocells are not shown) 


Figure 6.30 —_ Implementation of the FSM of Figure 6.3 in a CPLD. 
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Figure 6.31 The circuit from Figure 6.30 in a small CPLD. 


contain other circuitry that is connected to our FSM. The Clock signal is assigned to a pin 
on the chip that is dedicated for use by clock signals. From this dedicated pin a global wire 
distributes the clock signal to all of the flip-flops in the chip. The global wire distributes 
the clock signal to the flip-flops such that the difference in the arrival time, or clock skew, 
of the clock signal at each flip-flop is minimized. The concept of clock skew is discussed 
in Chapter 5. A global wire is also used for the reset signal. 

Figure 6.31 illustrates how the circuit might be assigned to the pins on a small CPLD. 
The figure is drawn with a part of the top of the chip package cut away, revealing a conceptual 
view of the two macrocells from Figure 6.30, which are indicated in blue. Our simple circuit 
uses only a small portion of the device. 


6.4.3 SIMULATING AND TESTING THE CIRCUIT 


The behavior of the circuit implemented in the CPLD chip can be tested using timing 
simulation, as depicted in Figure 6.32. The figure gives the waveforms that correspond to 
the timing diagram in Figure 6.9, assuming that a 100 ns clock period is used. The Resetn 
signal is set to 0 at the beginning of the simulation and then set to 1. The circuit produces 
the output z = 1 for one clock cycle after w has been equal to 1 for two successive clock 
cycles. When w is | for three clock cycles, z becomes | for two clock cycles, as it should 
be. We show the changes in state by using the letters A, B, and C for readability purposes. 
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250.0ns 500.0ns 750.0ns 


Figure 6.32 — Simulation results for the circuit in Figure 6.30. 


Having examined the simulation output, we should consider the question of whether 
we can conclude that the circuit functions correctly and satisfies all requirements. For our 
simple example it is not difficult to answer this question because the circuit has only one 
input and its behavior is straightforward. It is easy to see that the circuit works properly. 
However, in general it is difficult to ascertain with a high degree of confidence whether a 
sequential circuit will work properly for all possible input sequences, because a very large 
number of input patterns may be possible. For large finite state machines, the designer must 
think carefully about patterns of inputs that may be used in simulation for testing purposes. 


6.4.4 ALTERNATIVE STYLES OF VERILOG CODE 


We mentioned earlier in this section that Verilog does not specify a standard way for writing 
code that represents a finite state machine. The code given in Figure 6.29 is only one pos- 
sibility. A slightly different version of code for our simple machine is given in Figure 6.33. 
In this case, we specified the output z inside the always block that defines the required 
combinational circuit. The effect is the same as in Figure 6.29. 

A different approach is taken in Figure 6.34. A single always block is used. The states 
are represented by a two-bit vector y. The required state transitions are given in the always 
block with the sensitivity list that comprises the reset and clock signals. Asynchronous reset 
is specified when Resetn goes to 0. Other transitions are defined in the case statement to 
correspond directly to those in Figure 6.3. The default clause indicates that the valuation 
y = yoy; = 11 can be treated as a don’t-care condition. 

The assignment statement that defines z is placed outside the always block. This 
assignment cannot be made inside the always block as done in Figure 6.33. Doing so 
would infer a separate flip-flop for z, which would delay the changes in z with respect to y2 
by one clock cycle. 

We have shown three styles of Verilog code for our FSM example. The circuit produced 
by the Verilog compiler for each version of the code may be somewhat different because, 
as the reader is well aware by this point, there are many ways to implement a given logic 
function. However, the circuits produced from the three versions of the code provide 
identical functionality. 
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module simple (Clock, Resetn, w, z); 
input Clock, Resetn, w; 
output reg z; 
reg [2:1] y, Y; 
parameter [2:1] A= 2’b00, B = 2’b01, C = 2’b10; 


// Define the next state and output combinational circuits 
always @(w, y) 
begin 
case (y) 
A: if(w) Y=B 
else Y=A 
B: if(w) Y=C 
else Y=A; 
C: if(w) Y=C 
else Y=A 
default: Y 
endcase 
z=(y==C); // Define output 
end 


// Define the sequential block 

always @(negedge Resetn, posedge Clock) 
if (Resetn ==0) y<=A; 
else y<=Y; 


endmodule 


Figure 6.33 — Second version of code for the FSM in Figure 6.3. 


Example 6.5 


Figure 6.35 shows how the FSM in Figure 6.11 can be specified in Verilog using the style 
illustrated in Figure 6.29. This FSM has four states, which are encoded using all four 
possible valuations of the state variables, hence there is no need for a default clause in the 
case statement. 


6.4.5 SUMMARY OF DESIGN STEPS WHEN UsING CAD TooLs 


In Section 6.1.6 we summarized the design steps needed to derive sequential circuits man- 
ually. We have now seen that CAD tools can automatically perform much of the work. 
However, it is important to realize that the CAD tools have not replaced all manual steps. 
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module simple (Clock, Resetn, w, z); 
input Clock, Resetn, w; 
output z; 
reg [2:1] y; 
parameter [2:1] A =2’b00, B = 2’b01, C = 2’b10; 


// Define the sequential block 
always @(negedge Resetn, posedge Clock) 
if (Resetn ==0) y<=A; 


else 
case (y) 
A: if(w) y<=B; 
else y<=A; 
B: if(w) y<=C; 
else y<=A; 
C: if(w) y<=C; 
else y<=A; 
default: y <=2’bxx; 
endcase 
// Define output 


assign z = (y ==C); 
endmodule 


Figure 6.34 = Third version of code for the FSM in Figure 6.3. 


With reference to the list given in Section 6.1.6, the first two steps, in which the machine 
specification is obtained and a state diagram is derived, still have to be done manually. 
Given the state diagram information as input, the CAD tools then automatically perform 
the tasks needed to generate a circuit with logic gates and flip-flops. In addition to the 
design steps given in Section 6.1.6, we should add the testing and simulation stage. We 
will defer detailed discussion of this issue until Chapter 11. 


6.4.6 SPECIFYING THE STATE ASSIGNMENT IN VERILOG CODE 


In Section 6.2 we saw that the state assignment may have an impact on the complexity of 
the designed circuit. An obvious objective of the state-assignment process is to minimize 
the cost of implementation. The cost function that should be optimized may be simply the 
number of gates and flip-flops. But it could also be based on other considerations that may 
be representative of the type of chip used to implement the design. 
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module control (Clock, Resetn, w, Rlin, Rlout, R2in, R2out, R3in, R3out, Done); 
input Clock, Resetn, w; 
output Rlin, Rlout, R2in, R2out, R3in, R3out, Done; 
reg [2:1] y, Y; 
parameter [2:1] A =2’b00, B = 2’b01, C=2’b10, D=2’b11; 


// Define the next state combinational circuit 
always @(w, y) 
case (y) 

A: if(w) Y=B 

else Y=A 

B: Y=C: 

CG Y=D 

D: Y=A 
endcase 


// Define the sequential block 

always @(negedge Resetn, posedge Clock) 
if (Resetn ==0) y<=A; 
else y<=Y; 


// Define outputs 

assign R2out = (y == B); 
assign R3in = (y == B); 
assign Rlout = (y ==C); 
assign R2in = (y == C); 
assign R3out = (y == D); 
assign Rlin = (y == D); 
assign Done = (y == D); 


endmodule 


Figure 6.35 Verilog code for the FSM in Figure 6.11. 


A particular state assignment can be specified in Verilog code by means of a parameter 
statement as done in Figures 6.29 through 6.35. However, Verilog compilers usually have a 
capability to search for different assignments that may give better results. Most compilers 
are able to recognize a finite state machine when they encounter its specification in the 
code that conforms to typical styles, such as those used in this chapter. When the compiler 
detects an FSM, it can try to optimize its implementation by applying certain strategies 
such as looking for a better state assignment, attempting to use the one-hot encoding, and 
exploiting specific features of the target device. The user can either allow the compiler to 
use its FSM-handling capability, or surpress it in which case the compiler simply deals with 
the Verilog statements in the usual way. 
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6.4.7 SPECIFICATION OF MEALY FSMs USING VERILOG 


A Mealy-type FSM can be specified in a similar manner as a Moore-type FSM. Figure 6.36 
gives Verilog code for the FSM in Figure 6.23. The state transitions are described in the 
same way as in our first Verilog example in Figure 6.29. The variables y and Y represent 
the present and next states, which can have values A and B. Compared to the code in 
Figure 6.29, the main difference is the way in which the code for the output is written. 
In Figure 6.36 the output z is defined within the case statement that also defines the state 
transitions. When the FSM is in state A, z should be 0, but when in state B, z should take 
the value of w. Since the sensitivity list for the always block includes w, a change in w 
will immediately reflect itself in the value of z if the machine is in state B, which meets the 
requirements of the Mealy-type FSM. 

Implementing the FSM specified in Figure 6.36 in a CPLD chip yields the same equa- 
tions as we derived manually in Section 6.3. Simulation results for the synthesized circuit 
appear in Figure 6.37. The input waveform for w is the same as the one we used for the 
Moore-type machine in Figure 6.32. Our Mealy-type machine behaves correctly, with z 
becoming | just after the start of the second consecutive clock cycle in which w is 1. 

In the simulation results we have given in this section, all changes in the input w occur 
immediately following a positive clock edge. This is based on the assumption, stated in 
Section 6.1.5, that in a real circuit w would be synchronized with respect to the clock that 
controls the FSM. In Figure 6.38 we illustrate a problem that may arise if w does not meet 
this specification. In this case we have assumed that the changes in w take place at the 
negative edge of the clock, rather than at the positive edge when the FSM changes its state. 
The first pulse on the w input is 100 ns long. This should not cause the output z to become 
equal to 1. But the circuit does not behave in this manner. After the signal w becomes 
equal to 1, the first positive edge of the clock causes the FSM to change from state A to 
state B. As soon as the circuit reaches the state B, the w input is still equal to | for another 
50 ns, which causes z to go to 1. When w returns to 0, the z signal does likewise. Thus an 
erroneous 50-ns pulse is generated on the output z. 

We should pursue the consequences of this problem a little further. If z is used to drive 
another circuit that is not controlled by the same clock, then the extraneous pulse is likely 
to cause big problems. But if z is used as an input to a circuit (perhaps another FSM) that 
is controlled by the same clock, then the 50-ns pulse will be ignored by this circuit if z = 0 
before the next positive edge of the clock (accounting for the setup time). 
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We will now present another simple example that illustrates the complete design process. 
In Chapter 3 we discussed the addition of binary numbers in detail. We explained several 
schemes that can be used to add two n-bit numbers in parallel, ranging from carry-ripple 
to carry-lookahead adders. In these schemes the speed of the adder unit is an important 
design parameter. Fast adders are more complex and thus more expensive. If speed is not 
of great importance, then a cost-effective option is to use a serial adder, in which bits are 
added a pair at a time. 
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module mealy (Clock, Resetn, w, z); 
input Clock, Resetn, w; 
output reg z; 
reg y, Y; 
parameter A = 1’b0, B= 1’b1; 


// Define the next state and output combinational circuits 
always @(w, y) 


endcase 


// Define the sequential block 

always @(negedge Resetn, posedge Clock) 
if (Resetn ==0) y<=A; 
else y<=Y; 


endmodule 


Figure 6.36 Verilog code for the Mealy machine of Figure 6.23. 


6.5.1 MEALY-TyPE FSM For SERIAL ADDER 


Let A = ay—1d;)-2 +--+ do and B = b,_1by_-2--- bo be two unsigned numbers that have to be 
added to produce Sum = S,—1S,-2--++ So. Our task is to design a circuit that will perform 
serial addition, dealing with a pair of bits in one clock cycle. The process starts by adding 
bits do and bo. In the next clock cycle, bits a; and b; are added, including a possible 
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250.0ns 500.0ns 750.0ns 


Figure 6.37 — Simulation results for the Mealy machine. 


Figure 6.38 Potential problem with asynchronous inputs to a Mealy FSM. 


carry from the bit-position 0, and so on. Figure 6.39 shows a block diagram of a possible 
implementation. It includes three shift registers that are used to hold A, B, and Sum as the 
computation proceeds. Assuming that the input shift registers have parallel-load capability, 
as depicted in Figure 5.18, the addition task begins by loading the values of A and B into 
these registers. Then in each clock cycle, a pair of bits is added by the adder FSM, and 
at the end of the cycle the resulting sum bit is shifted into the Sum register. We will use 
positive-edge-triggered flip-flops in which case all changes take place soon after the positive 
edge of the clock, depending on the propagation delays within the various flip-flops. At this 
time the contents of all three shift registers are shifted to the right; this shifts the existing 
sum bit into Sum, and it presents the next pair of input bits a; and b; to the adder FSM. 
Now we are ready to design the required FSM. This cannot be a combinational circuit 
because different actions will have to be taken, depending on the value of the carry from the 
previous bit position. Hence two states are needed: let G and H denote the states where the 
carry-in values are 0 and 1, respectively. Figure 6.40 gives a suitable state diagram, defined 
as a Mealy model. The output value, s, depends on both the state and the present value of 
the inputs a and b. Each transition is labeled using the notation ab/s, which indicates the 
value of s for a given valuation ab. In state G the input valuation 00 will produce s = 0, 
and the FSM will remain in the same state. For input valuations 01 and 10, the output will 
be s = 1, and the FSM will remain in G. But for 11, s = 0 is generated, and the machine 
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Shift register 


Clock 


Figure 6.39 — Block diagram for the serial adder. 


Reset (ab/s) 


| 11/0 
00/0 01/0 
01/1 10/0 
10/1 11/1 
00/1 
G: carry-in = 0 
H: carry-in = 1 


Figure 6.40 — State diagram for the serial adder FSM. 


moves to state H. In state H valuations 01 and 10 cause s = 0, while 11 causes s = 1. In 
all three of these cases, the machine remains in H. However, when the valuation 00 occurs, 
the output of 1 is produced and a change into state G takes place. 

The corresponding state table is presented in Figure 6.41. A single flip-flop is needed 
to represent the two states. The state assignment can be done as indicated in Figure 6.42. 
This assignment leads to the following next-state and output equations 


Y =ab+ay+ by 
s=a@b@y 


Comparing these expressions with those for the full-adder in Section 3.2, it is obvious that 
y is the carry-in, Y is the carry-out, and s is the sum of the full-adder. Therefore, the adder 
FSM box in Figure 6.39 consists of the circuit shown in Figure 6.43. The flip-flop can be 
cleared by the Reset signal at the start of the addition operation. 

The serial adder is a simple circuit that can be used to add numbers of any length. The 
structure in Figure 6.39 is limited in length only by the size of the shift registers. 
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Present Next state Output s 
state |ab=00 O1 10 11/00 01 10 11 
G G GG H}]0 1 1 0 
H G H H Hy] 1 0 O 1 
Figure 6.41 — State table for the serial adder FSM. 
Next state Output 
Present 
state ab=00 O1 10 111;00 O1 10 I1 
y Y Ss 
0 0 oO O 1,0 1 1 0 
1 0 1 1 1 1 0 O 1 
Figure 6.42 —State-assigned table for Figure 6.41. 


Figure 6.43 


Reset 


6.5.2 Moore-Tyre FSM For SERIAL ADDER 


Circuit for the adder FSM in Figure 6.39. 


In the preceding example we saw that a Mealy-type FSM nicely meets the requirement 
for implementing the serial adder. Now we will try to achieve the same objective using a 
Moore-type FSM. A good starting point is the state diagram in Figure 6.40. In a Moore-type 
FSM, the output must depend only on the state of the machine. Since in both states, G and 
H, itis possible to produce two different outputs depending on the valuations of the inputs 
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a and b, a Moore-type FSM will need more than two states. We can derive a suitable state 
diagram by splitting both G and H into two states. Instead of G, we will use Go and G; to 
denote the fact that the carry is 0 and that the sum is either 0 or 1, respectively. Similarly, 
instead of H, we will use Ho and Hy. Then the information in Figure 6.40 can be mapped 
into the Moore-type state diagram in Figure 6.44 in a straightforward manner. 

The corresponding state table is given in Figure 6.45 and the state-assigned table in 
Figure 6.46. The next-state and output expressions are 


Yi, =a@b@y2 
Y, = ab+ ay2 + by2 
s=y)) 


The expressions for Y; and Y> correspond to the sum and carry-out expressions in the 
full-adder circuit. The FSM is implemented as shown in Figure 6.47. It is interesting to 
observe that this circuit is very similar to the circuit in Figure 6.43. The only difference is 
that in the Moore-type circuit, the output signal, s, is passed through an extra flip-flop and 
thus delayed by one clock cycle with respect to the Mealy-type sequential circuit. Recall 
that we observed the same difference in our previous example, as depicted in Figures 6.26 
and 6.27. 

A key difference between the Mealy and Moore types of FSMs is that in the former a 
change in inputs reflects itself immediately in the outputs, while in the latter the outputs do 
not change until the change in inputs forces the machine into a new state, which takes place 
one clock cycle later. We encourage the reader to draw the timing diagrams for the circuits 
in Figures 6.43 and 6.47, which will exemplify further this key difference between the two 
types of FSMs. 


Reset 


Figure 6.44 — State diagram for the Moore-type serial adder FSM. 
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Present Next state Output 
state | ab=00 O1 10 11 s 
Go Go Gi G, Ho 0 
G, Go Gi G, Ho 1 
Hy Gi, Hig Mg. Hl! (20 
Hy Gi Ho Ho Hy 1 


Figure 6.45 _—_ State table for the Moore-type serial adder FSM. 


Next state 
Present 
state |ab=00 O01 10 11 | Output 
y2Y1 YY, Ss 
00 00 O01 O1 10 0 
01 00 O01 O01 10 1 
10 01 10 10 11 0 
11 01 10 10 11 1 


Figure 6.46 = State-assigned table for Figure 6.45. 


Sum bit 


Clock 


Reset 


Figure 6.47 Circuit for the Moore-type serial adder FSM. 
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6.5.3 VERILOG CODE FOR THE SERIAL ADDER 


The serial adder can be described in Verilog by writing code for the shift registers and the 
adder FSM. We will first design the shift register and then use it as a subcircuit in the serial 
adder. 


Shift Register Subcircuit 

In the serial adder it is beneficial to have the ability to prevent the shift register contents 
from changing when an active clock edge occurs. Figure 6.48 gives the code for a shift 
register named shiftrne, which has an enable input, E. When E = 1, the contents of the 
register are shifted left-to-right on the positive edge of the clock. Setting E = 0 prevents 
the contents of the shift register from changing. 


Complete Code 

The code for the serial adder is shown in Figure 6.49. It instantiates three shift registers 
for the inputs A and B and the output Sum. The shift registers are loaded with parallel data 
when the circuit is reset. The state diagram for the adder FSM is described by two always 
blocks, using the style of code in Figure 6.36. In addition to the components of the serial 
adder shown in Figure 6.39, the Verilog code includes a down-counter to determine when 
the adder should be halted because all n bits of the required sum are present in the output 
shift register. When the circuit is reset, the counter is loaded with the number of bits in the 
serial adder, n. The counter counts down to 0, and then stops and disables further changes 
in the output shift register. 


module shiftrne (R, L, E, w, Clock, Q); 
parameter n= 8; 
input [n—1:0] R; 
input L, E, w, Clock; 
output reg [n—1:0] Q; 
integer k; 


always @(posedge Clock) 


if (L) 
Q<=R; 

else if (E) 

begin 


for (k =n—1;k > 0;k=k-1) 
Q{k- 1] <= QIk]; 
Q{In-1] <=w; 
end 


endmodule 


Figure 6.48 = Code for a left-to-right shift register with an 
enable input. 
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module serial_adder(A, B, Reset, Clock, Sum); 
input [7:0] A, B; 
input Reset, Clock; 
output wire [7:0] Sum; 
reg [3:0] Count; 
reg s, y, Y; 
wire [7:0] QA, QB; 
wire Run; 
parameter G= 1’b0,H= 1’b1; 


shiftrne shift_A (A, Reset, 1’b1, 1’b0, Clock, QA); 
shiftrne shift_B (B, Reset, 1’b1, 1’b0, Clock, QB); 
shiftrne shift_Sum (8’b0, Reset, Run, s, Clock, Sum); 


// Adder FSM 
// Output and next state combinational circuit 
always @(QA, QB, y) 
case (y) 
G: begin 
s = QA[0] * QB[O]; 
if (QA[0] & QB[O]) Y =H; 
else Y=G; 
end 
H: begin 
s = QA[0] ~ * QB[0]; 
if (~QA[0] &~ QB[0]) Y=G; 
else Y =H; 
end 
default: Y=G; 
endcase 


// Sequential block 
always @(posedge Clock) 
if (Reset) y<=G; 

else y<=Y; 


// Control the shifting process 
always @(posedge Clock) 

if (Reset) Count = 8; 

else if (Run) Count = Count — 1; 
assign Run =|Count; 


endmodule 


Figure 6.49 Verilog code for the serial adder. 
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The code in Figure 6.49 implements a serial adder for eight-bit numbers. The wires 
QA and QB correspond to the parallel outputs of the shift registers with inputs A and B in 
Figure 6.39. The variable s represents the output of the adder FSM. 

In Figure 6.39 the shift registers for inputs A and B do not use a serial input or an enable 
input. However, the shiftrne component, which is used for all three shift registers, includes 
these ports and so signals must be connected to them. The enable input for the two shift 
registers can be connected to logic value 1. The value shifted into the serial input does not 
matter, so it can be connected to either 1 or 0; we have chosen to connect it to 0. The shift 
registers are loaded in parallel by the Reset signal. We have chosen to use an active-high 
reset signal for the circuit. The output shift register does not need a parallel data input, so 
all Os are connected to this input. 

The first always block describes the state transitions and the output of the adder FSM 
in Figure 6.40. The output definition follows from observing in Figure 6.40 that when the 
FSM is in state G, the sum is s = a @ b, and when in state H, the sum is s=a@b. The 
second always block implements the flip-flop y and provides synchronous reset when Reset 
ae 

The enable input for the output shift register is named Run. It is derived from the 
outputs of the down-counter specified in the third always block. When Reset = 1, Count 
is initialized to the value 8. Then as long as Run = 1, Count is decremented in each clock 
cycle. Run is set to 0 when Count is equal to 0, which is detected by using the reduction 
OR operator. 


Synthesis and Simulation of the Verilog Code 


The results of synthesizing a circuit from the code in Figure 6.49 are illustrated in 
Figure 6.50a. The outputs of the counter are ORed to provide the Run signal, which 
enables clocking of both the output shift register and the counter. A sample of a timing 
simulation for the circuit is shown in Figure 6.50b. The circuit is first reset, resulting in the 
values of A and B being loaded into the input shift registers, and the value 8 loaded into the 
down-counter. After each clock cycle one pair of bits of the input numbers is added by the 
adder FSM, and the sum bit is shifted into the output shift register. After eight clock cycles 
the output shift register contains the correct sum, and shifting is halted by the Run signal 
becoming equal to 0. 


6.6 STATE MINIMIZATION 


Our introductory examples of finite state machines were so simple that it was easy to see 
that the number of states that we used was the minimum possible to perform the required 
function. When a designer has to design a more complex FSM, it is likely that the initial 
attempt will result in a machine that has more states than is actually required. Minimizing 
the number of states is of interest because fewer flips-flops may be needed to represent the 
states and the complexity of the combinational circuit needed in the FSM may be reduced. 
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(a) Circuit 
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(b) Simulation Results 


Figure 6.50 = Synthesized serial adder. 
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If the number of states in an FSM can be reduced, then some states in the original 
design must be equivalent to other states in their contribution to the overall behavior of the 
FSM. We can express this more formally in the following definition. 


Definition 6.1 — Two states S; and S; are said to be equivalent if and only if for every 
possible input sequence, the same output sequence will be produced regardless of whether 
S; or S; is the initial state. 


It is possible to define a minimization procedure that searches for any states that are equiv- 
alent. Such a procedure is very tedious to perform manually, but it can be automated for 
use in CAD tools. We will not pursue it here, because of its tediousness. However, to pro- 
vide some appreciation of the impact of state minimization, we will present an alternative 
approach, which is much more efficient but not quite as broad in scope. 

Instead of trying to show that some states in a given FSM are equivalent, it is often 
easier to show that some states are definitely not equivalent. This idea can be exploited to 
define a simple minimization procedure. 


6.6.1 PARTITIONING MINIMIZATION PROCEDURE 


Suppose that a state machine has a single input w. Then if the input signal w = 0 is applied 
to this machine in state S; and the result is that the machine moves to state S,,, we will say 
that S,, is a 0-successor of S;. Similarly, if w = | is applied in the state S; and it causes the 
machine to move to state S,, we will say that S, is a l-successor of S;. In general, we will 
refer to the successors of S; as its k-successors. When the FSM has only one input, k can 
be either 0 or 1. But if there are multiple inputs to the FSM, then k represents the set of all 
possible combinations (valuations) of the inputs. 

From Definition 6.1 it follows that if the states S$; and S; are equivalent, then their 
corresponding k-successors (for all &) are also equivalent. Using this fact, we can formulate 
a minimization procedure that involves considering the states of the machine as a set and 
then breaking the set into partitions that comprise subsets that are definitely not equivalent. 


Definition 6.2 — A partition consists of one or more blocks, where each block comprises 
a subset of states that may be equivalent, but the states in a given block are definitely not 
equivalent to the states in other blocks. 


Let us assume initially that all states are equivalent; this forms the initial partition, P,, 
in which all states are in the same block. As the next step, we will form the partition 
P, in which the set of states is partitioned into blocks such that the states in each block 
generate the same output values. Obviously, the states that generate different outputs cannot 
possibly be equivalent. Then we will continue to form new partitions by testing whether 
the k-successors of the states in each block are contained in one block. Those states whose 
k-successors are in different blocks cannot be in one block. Thus new blocks are formed 
in each new partition. The process ends when a new partition is the same as the previous 
partition. Then all states in any one block are equivalent. To illustrate the procedure, 
consider Example 6.6. 
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Figure 6.51 shows a state table for a particular FSM. In an attempt to minimize the number 
of states, let us apply the partitioning procedure. The initial partition contains all states in 
a single block 


P, = (ABCDEFG) 


The next partition separates the states that have different outputs (note that this FSM is of 
Moore type), which means that the states A, B, and D must be different from the states C, 
E, F, and G. Thus the new partition has two blocks 


P» = (ABD)(CEFG) 


Now we must consider all 0- and 1-successors of the states in each block. For the block 
(ABD), the 0-successors are (BDB), respectively. Since all of these successor states are in 
the same block in P2, we should still assume that the states A, B, and D may be equivalent. 
The 1-successors for these states are (CFG). Since these successors are also in the same 
block in P», we conclude that (ABD) should remain in one block of P3. Next consider the 
block (CEFG). Its 0-successors are (FFEF), respectively. They are in the same block in 
P,. The 1-successors are (ECDG). Since these states are not in the same block in Py, it 
means that at least one of the states in the block (CEFG) is not equivalent to the others. In 
particular, the state F must be different from the states C, E, and G because its 1-successor 
is D, which is in a different block than C, E, and G. Hence 


P3 = (ABD)(CEG)(F) 


Repeating the process yields the following. The 0-successors of (ABD) are (BDB), which 
are in the same block of P3. The 1-successors are (CFG), which are not in the same block. 
Since F is in a different block than C and G, it follows that the state B cannot be equivalent 
to states A and D. The 0- and 1-successors of (CEG) are (FFF) and (ECG), respectively. 
Both of these subsets are accommodated in the blocks of P3. Therefore 


P4 = (AD)(B)(CEG)(F) 


Present Next state Output 
state w=0 wel Zz 
A B Cc 1 
B D F 1 
C F E 0 
D B G 1 
E F C 0 
F E D 0 
G F G 0) 


Figure 6.51 — State table for Example 6.6. 


Example 6.6 
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Present Next state Output 
state w=0 wel Z 
A B C 1 
B A F 1 
C F C 0 
F C A 0 


Figure 6.52 — Minimized state table for Example 6.6. 


If we follow the same approach to check the 0- and 1-successors of the blocks (AD) and 
(CEG), we find that 


Ps = (AD)(B)(CEG)(F) 


Since Ps; = P, and no new blocks are generated, it follows that states in each block are 
equivalent. If the states in some block were not equivalent, then their k-successors would 
have to be in different blocks. Therefore, states A and D are equivalent, and C, E, and G 
are equivalent. Since each block can be represented by a single state, only four states are 
needed to implement the FSM defined by the state table in Figure 6.51. If we let the symbol 
A represent both the states A and D in the figure and the symbol C represent the states C, 
E, and G, then the state table reduces to the state table in Figure 6.52. 

The effect of the minimization is that we have found a solution that requires only two 
flip-flops to realize the four states of the minimized state table, instead of needing three 
flip-flops for the original design. The expectation is that the FSM with fewer states will be 
simpler to implement, although this is not always the case. 

The state minimization concept is based on the fact that two different FSMs may exhibit 
identical behavior in terms of the outputs produced in response to all possible inputs. Such 
machines are functionally equivalent, even though they are implemented with circuits that 
may be vastly different. In general, it is not easy to determine whether or not two arbitrary 
FSMs are equivalent. Our minimization procedure ensures that a simplified FSM is func- 
tionally equivalent to the original one. We encourage the reader to get an intuitive feeling 
that the FSMs in Figures 6.51 and 6.52 are indeed functionally equivalent by implementing 
both machines and simulating their behavior using the CAD tools. 


Example 6.7 


As another example of minimization, we will consider the design of a sequential circuit that 
could control a vending machine. Suppose that a coin-operated vending machine dispenses 
candy under the following conditions: 


¢ The machine accepts nickels and dimes. 


¢ It takes 15 cents for a piece of candy to be released from the machine. 


6.6 STATE MINIMIZATION 


SENSE y, 
SENSE 
N 
D 


(a) Timing diagram 


a 


SENSE n, 


Clock > > 


(b) Circuit that generates N 


Figure 6.53 = Signals for the vending machine. 


¢ If 20 cents is deposited, the machine will not return the change, but it will credit the 
buyer with 5 cents and wait for the buyer to make a second purchase. 


All electronic signals in the vending machine are synchronized to the positive edge of a 
clock signal, named Clock. The exact frequency of the clock signal is not important for our 
example, but we will assume a clock period of 100 ns. The vending machine’s coin-receptor 
mechanism generates two signals, sensep and sensey, which are asserted when a dime or 
a nickel is detected. Because the coin receptor is a mechanical device and thus very slow 
compared to an electronic circuit, inserting a coin causes sensep or sensey to be set to | fora 
large number of clock cycles. We will assume that the coin receptor also generates two other 
signals, named D and N. The D signal is set to 1 for one clock cycle after sensep becomes 
1, and N is set to 1 for one clock cycle after sensey becomes 1. The timing relationships 
between Clock, sensep, sensey, D, and N are illustrated in Figure 6.53a. The hash marks 
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Figure 6.54 — State diagram for Example 6.7. 


on the waveforms indicate that sensep or sensey may be | for many clock cycles. Also, 
there may be an arbitrarily long time between the insertion of two consecutive coins. Note 
that since the coin receptor can accept only one coin at a time, it is not possible to have both 
D and N set to | at once. Figure 6.53b illustrates how the N signal may be generated from 
the sensey signal. 

Based on these assumptions, we can develop an initial state diagram in a fairly straight- 
forward manner, as indicated in Figure 6.54. The inputs to the FSM are D and N, and the 
starting state is $1. As long as D=N =O, the machine remains in state $1, which is 
indicated by the arc labeled D - N = 1. Inserting a dime leads to state $2, while inserting a 
nickel leads to state S3. In both cases the deposited amount is less than 15 cents, which is 
not sufficient to release the candy. This is indicated by the output, z, being equal to 0, as in 
S2/0 and $3/0. The machine will remain in state $2 or $3 until another coin is deposited 
because D = N = 0. In state $2 a nickel will cause a transition to $4 and a dime to $5. 
In both of these states, sufficient money is deposited to activate the output mechanism that 
releases the candy; hence the state nodes have the labels $4/1 and $5/1. In $4 the deposited 
amount is 15 cents, which means that on the next active clock edge the machine should 
return to the reset state $1. The condition D - N on the arc leaving $4 is guaranteed to be 
true because the machine remains in state $4 for only 100 ns, which is far too short a time 
for a new coin to have been deposited. 


6.6 STATE MINIMIZATION 


Present Next state Output 
state | pDN=00 O01 10 11 z 
Sl Sl S3 S82 - 0 
$2 S2 S4 S5 - 0 
S3 S3 S6 S7 - 0 
S4 Sl o- = = 1 
S5 S30 - - - 1 
S6 S6 S8& S9 - 0 
S7 Sl o- - ca 1 
S8 Sl o- - = 1 
S9 S30 - = = 1 


Figure 6.55 — State table for Example 6.7. 


The state $5 denotes that an amount of 20 cents has been deposited. The candy 
is released, and on the next clock edge the FSM makes a transition to state $3, which 
represents a credit of 5 cents. A similar reasoning when the machine is in state $3 leads to 
states $6 through $9. This completes the state diagram for the desired FSM. A state table 
version of the same information is given in Figure 6.55. 

Note that the condition D = N = | is denoted as don’t care in the table. Note also 
other don’t cares in states $4, S5, S7, 58, and S9. They correspond to cases where there is 
no need to check the D and N signals because the machine changes to another state in an 
amount of time that is too short for a new coin to have been inserted. 

Using the minimization procedure, we obtain the following partitions 


P, = (S1, $2, S3, $4, $5, S6, $7, S8, S9) 
Py = (S1, $2, $3, $6)(S4, $5, S7, $8, S9) 
P3 = (S1)(S3)(S2, $6)(S4, $5, 57, $8, S9) 
P4 = (S1)(S3)(S2, $6)(S4, $7, $8)(S5, S9) 
Ps = (S1)(S3)(S2, $6)(S4, $7, $8)(S5, S9) 


The final partition has five blocks. Let $2 denote its equivalence to $6, let S4 denote the 
same with respect to S7 and $8, and let $5 represent $9. This leads to the minimized 
state table in Figure 6.56. The actual circuit that implements this table can be designed as 
explained in the previous sections. 

In this example we used a straightforward approach to derive the original state dia- 
gram, which we then minimized using the partitioning procedure. Figure 6.57 presents 
the information in the state table of Figure 6.56 in the form of a state diagram. Looking 
at this diagram, the reader can probably see that it may have been quite feasible to derive 
the optimized diagram directly, using the following reasoning. Suppose that the states cor- 
respond to the various amounts of money deposited. In particular, the states, $1, $3, $2, 
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Present Next state Output 
state | pN=00 O1 10 11 z 
Sl S1 $3 S2 — 0 
S2 S2 $4 S5 — 0 
S3 $3 $2 S4 -— 0 
S4 a 1 
S5 a 1 


Figure 6.56 — Minimized state table for Example 6.7. 


Figure 6.57 — Minimized state diagram for Example 6.7. 


S4, and S5 correspond to the amounts of 0, 5, 10, 15, and 20 cents, respectively. With 
this interpretation of the states, it is not difficult to derive the transition arcs that define the 
desired FSM. In practice, the designer can often produce initial designs that do not have a 
large number of superfluous states. 

We have found a solution that requires five states, which is the minimum number of 
states for a Moore-type FSM that realizes the desired vending control task. From Sec- 
tion 6.3 we know that Mealy-type FSMs may need fewer states than Moore-type machines, 


6.6 STATE MINIMIZATION 


DN /0 


N/0 D/\ 


N/0 D/\ 


DN/0 


Figure 6.58 = Mealy-type FSM for Example 6.7. 


although they do not necessarily lead to simpler overall implementations. If we use the 
Mealy model, we can eliminate states $4 and $5 in Figure 6.57. The result is shown in 
Figure 6.58. This version requires only three states, but the output functions become more 
complicated. The reader is encouraged to compare the complexity of implementations by 
completing the design steps for the FSMs in Figures 6.57 and 6.58. 
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6.6.2 INCOMPLETELY SPECIFIED FSMs 


The partitioning scheme for minimization of states works well when all entries in the state 
table are specified. Such is the case for the FSM defined in Figure 6.51. FSMs of this 
type are said to be completely specified. If one or more entries in the state table are not 
specified, corresponding to don’t-care conditions, then the FSM is said to be incompletely 
specified. An example of such an FSM is given in Figure 6.55. As seen in Example 6.7, the 
partitioning scheme works well for this FSM also. But in general, the partitioning scheme is 
less useful when incompletely specified FSMs are involved, as illustrated by Example 6.8. 


Consider the FSM in Figure 6.59 which has four unspecified entries, because we have as- 
sumed that the input w = 1 will not occur when the machine is in states B or G. Accordingly, 


Example 6.8 
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Ppctant Next state Output z 
state w=0 we=1l]w=0 wel 
A B C 0 0 
B D = 0 _ 

C F E 0 1 
D B G 0 0 
E F C 0 1 
F E D 0 1 
G F = 0 if 


Figure 6.59 —_Incompletely specified state table for Example 6.8. 


neither a state transition nor an output value is specified for these two cases. An important 
difference between this FSM and the one in Figure 6.55 is that some outputs in this FSM 
are unspecified, whereas in the other FSM all outputs are specified. 

The partitioning minimization procedure can be applied to Mealy-type FSMs in the 
same way as for Moore-type FSMs illustrated in Examples 6.6 and 6.7. Two states are 
considered equivalent, and are thus placed in the same block of a partition, if their outputs 
are equal for all corresponding input valuations. To perform the partitioning process, we 
can assume that the unspecified outputs have a specific value. Not knowing whether these 
values should be 0 or 1, let us first assume that both unspecified outputs have a value of 0. 
Then the first two partitions are 


P, = (ABCDEFG) 
P) = (ABDG)(CEF) 


Note that the states A, B, D, and G are in the same block because their outputs are equal to 0 
for both w = 0 and w = 1. Also, the states C, FE, and F are in one block because they have 
the same output behavior; they all generate z = Oif w = 0, andz = 1 ifw = 1. Continuing 
the partitioning procedure gives the remaining partitions 


P3 = (AB)(D)(G)(CE)(F) 
P4 = (A)(B)(D)(G)(CE)(F) 
Ps=Py 


The result is an FSM that is specified by six states. 
Next consider the alternative of assuming that both unspecified outputs in Figure 6.59 
have a value of 1. This would lead to the partitions 


6.7. DESIGN OF A COUNTER USING THE SEQUENTIAL CIRCUIT APPROACH 


P, = (ABCDEFG) 

P» = (AD)(BCEFG) 

P3 = (AD)(B)(CEFG) 
P4 = (AD)(B)(CEG)(F) 
Ps = P4 


This solution involves four states. Evidently, the choice of values assigned to unspecified 
outputs is of considerable importance. 

We will not pursue the issue of state minimization of incompletely specified FSMs any 
further. As we already mentioned, it is possible to develop a minimization technique that 
searches for equivalent states based directly on Definition 6.1. This approach is described 
in many books on logic design [2, 8-10, 12-14]. 

Finally, it is important to mention that reducing the number of states in a given FSM 
will not necessarily lead to a simpler implementation. Interestingly, the effect of state 
assignment, discussed in Section 6.2, may have a greater influence on the simplicity of 
implementation than does the state minimization. In a modern design environment, the 
designer relies on the CAD tools to implement state machines efficiently. 
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6.7 DESIGN OF A COUNTER USING THE SEQUENTIAL 
CIRCUIT APPROACH 


In this section we discuss the design of a counter circuit as a finite state machine. From 
Chapter 5 we already know that counters can be realized as cascaded stages of flip-flops 
and some gating logic, where each stage divides the number of incoming pulses by two. To 
keep our example simple, we choose a counter of small size but also show how the design 
can be extended to larger sizes. The specification for the counter is 


e The counting sequence is 0, 1,2,...,6,7,0,1,... 


e There exists an input signal w. The value of this signal is considered during each clock 
cycle. Ifw = 0, the present count remains the same; if w = 1, the count is incremented. 


The counter can be designed as a synchronous sequential circuit using the design techniques 
introduced in the previous sections. 


6.7.1 STATE DIAGRAM AND STATE TABLE FOR A MODULO-8 COUNTER 


Figure 6.60 gives a state diagram for the desired counter. There is a state associated with 
each count. In the diagram state A corresponds to count 0, state B to count 1, and so on. We 
show the transitions between the states needed to implement the counting sequence. Note 
that the output signals are specified as depending only on the state of the counter at a given 
time, which is the Moore model of sequential circuits. 

The state diagram may be represented in the state-table form as shown in Figure 6.61. 
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Figure 6.60 = State diagram for the counter. 


Present Next state Output 
State w=0 wel 
A A B 0 
B B C 1 
C C D 2 
D D E 3 
E E F 4 
F F G 5 
G G H 6 
H H A 7 


Figure 6.61 — State table for the counter. 


6.7.2 STATE ASSIGNMENT 


Three state variables are needed to represent the eight states. Let these variables, denoting 
the present state, be called y2, y;, and yo. Let Yo, Y;, and Yo denote the corresponding 
next-state functions. The most convenient (and simplest) state assignment is to encode 
each state with the binary number that the counter should give as output in that state. Then 
the required output signals will be the same as the signals that represent the state variables. 
This leads to the state-assigned table in Figure 6.62. 


6.7. DESIGN OF A COUNTER USING THE SEQUENTIAL CIRCUIT APPROACH 


Next state 
Present Gonnt 
oun 
state w=0 wel 
Y2¥1 No 29% 1% 


Y2¥i¥o Yo¥i¥o 


A 000 000 001 000 
B 001 001 010 001 
C 010 010 O11 010 
D O11 O11 100 O11 
E 100 100 101 100 
F 101 101 110 101 
G 110 110 111 110 
H 111 111 000 111 


Figure 6.62 — State-assigned table for the counter. 


The final step in the design is to choose the type of flip-flops and derive the expressions 
that control the flip-flop inputs. The most straightforward choice is to use D-type flip-flops. 
We pursue this approach first. Then we show the alternative of using JK-type flip-flops. 


6.7.3 IMPLEMENTATION USING D-TyPeE FLie-FLOPS 


When using D-type flip-flops to realize the finite state machine, each next-state function, 
Y;, is connected to the D input of the flip-flop that implements the state variable y;. The 
next-state functions are derived from the information in Figure 6.62. Using Karnaugh maps 
in Figure 6.63, we obtain the following implementation 


Do = Yo = Wyo + WY 


D, = Y; = wy + yi¥o + wyoy1 
Dz = Y2 = Wy2 + Yoy2 + Yiy2 + wyoyiY2 
The resulting circuit is given in Figure 6.64. It is not obvious how to extend this circuit to 
implement a larger counter, because no clear pattern is discernible in the expressions for 
Do, D, and Dz. However, we can rewrite these expressions as follows 
Do = Wyo + WYo 
=Ww@®yo 


D, = Wwyi + y1¥o + Wyo) 
= (W+Yo)y1 + wyoy, 
= Wyoy1 + wyoy, 
=wyo OY 
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Yio Y1¥0 


WY Wy 


oo Ol 11 10 00 Ol 11 10 


Yo = Wyo two Y, = Wy, +yiYotwyoy, 


WY 


Yo = WYot YoYat ViY2 t+ WYO Y2 


Figure 6.63 = Karnaugh maps for D flip-flops for the counter. 


Dy = Wy2 + Voy2 + Viy2 + WoV1Y2 
= (W+Yo + Yi )y2 + wyoyiY2 
= Wy0yy2 + WY0YIY> 
= wyoy1 ® y2 


Then an obvious pattern emerges, which leads to the circuit in Figure 5.23. 


6.7.4 IMPLEMENTATION USING JK-TYPE FLIP-FLOPS 


JK-type flip-flops provide an attractive alternative. Using these flip-flops to implement the 
sequential circuit specified in Figure 6.62 requires derivation of J and K inputs for each 
flip-flop. The following control is needed: 


e Ifa flip-flop in state 0 is to remain in state 0, then J = 0 and K = d (where d means 
that K can be equal to either 0 or 1). 


e Ifa flip-flop in state 0 is to change to state 1, then J = 1 and K = d. 
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YO 


y1 


y2 


Clock 


Resetn 


Figure 6.64 = Circuit diagram for the counter implemented with D flip-flops. 


e Ifa flip-flop in state 1 is to remain in state 1, then J = d and K = 0. 
e Ifa flip-flop in state 1 is to change to state 0, then J = d and K = 1. 


Following these guidelines, we can create a truth table that specifies the required 
values of the J and K inputs for the three flip-flops in our design. Figure 6.65 shows a 
modified version of the state-assigned table in Figure 6.62, with the J and K input functions 
included. To see how this table is derived, consider the first row in which the present state is 
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Flip-flop inputs 


Present Gsuni 
oun 

state w= 0 w= l 
Y21 Vo 2% 1% 


YoYiYo | oK2x Ji Ki JoKo | YoYiYo | Kx SK, JnKo 


000 000 Od Od Od 001 Od Od Id 000 
001 001 Od Od d0 010 Od ld dl 001 
010 010 Od d0 Od O11 Od d0 Id 010 
O11 O11 Od d0 d0 100 Id dl dl O11 
100 100 d0 Od Od 101 d0 Od Id 100 


101 101 d0 Od d0 110 d0 ld dl 101 
110 110 d0 d0 Od 111 d0 d0 Id 110 
111 111 d0 d0 dd 000 dl dl dl 111 


Figure 6.65 —_ Excitation table for the counter with JK flip-flops. 


yoyiyo = 000. If w = 0, then the next state is also Y2Y; Yo = 000. Thus the present value of 
each flip-flop is 0, and it should remain 0. This implies the control J = 0 and K = d forall 
three flip-flops. Continuing with the first row, if w = 1, the next state will be Y2Y, Yo = 001. 
Thus flip-flops y2 and y, still remain at 0 and have the control J = 0 and K = d. However, 
flip-flop yp must change from 0 to 1, which is accomplished with J = 1 and K = d. The 
rest of the table is derived in the same manner by considering each present state yy; yo and 
providing the necessary control signals to reach the new state Y2Yj Yo. 

A state-assigned table is essentially the state table in which each state is encoded using 
the state variables. When D flip-flops are used to implement an FSM, the next-state entries 
in the state-assigned table correspond directly to the signals that must be applied to the 
D inputs. This is not the case if some other type of flip-flops is used. A table that gives 
the state information in the form of the flip-flop inputs that must be “excited” to cause the 
transitions to the next states is usually called an excitation table. The excitation table in 
Figure 6.65 indicates how JK flip-flops can be used. In many books the term excitation 
table is used even when D flip-flops are involved, in which case it is synonymous with the 
state-assigned table. 

Once the table in Figure 6.65 has been derived, it provides a truth table with inputs ys, 
y1, yo, and w, and outputs Jo, K2, Jj, Ki, Jo, and Ko. We can then derive expressions for 
these outputs as shown in Figure 6.66. The resulting expressions are 


Jo = Kyo =w 
J, = Ki = wyo 
Jz = Ky = wyoy1 
This leads to the circuit shown in Figure 6.67. It is apparent that this design can be extended 


easily to larger counters. The pattern J, = K, = wyoy1--- Yn—1 defines the circuit for each 
stage in the counter. Note that the size of the AND gate that implements the product term 


6.7. DESIGN OF A COUNTER USING THE SEQUENTIAL CIRCUIT APPROACH 389 


Yio Yio 
Wyo 


WY> wy 


Wyo wy 


Ky = wygyy 


Figure 6.66 — Karnaugh maps for JK flip-flops in the counter. 
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Clock 


Resetn 


Figure 6.67 — Circuit diagram using JK flip-flops. 


yoy: + Yn—1 grows with successive stages. A circuit with a more regular structure can be 
obtained by factoring out the previously needed terms as we progress through the stages of 
the counter. This gives 


Jy = Ky = (wyo)y1 =Jiy\ 
Jn = Ky = (wyo °° Yn Vn l= Jn 1¥n-1 


Using the factored form, the counter circuit can be realized as indicated in Figure 6.68. In 
this circuit all stages (except the first) look the same. Note that this circuit has the same 
structure as the circuit in Figure 5.22 because connecting the J and K inputs of a flip-flop 
together turns the flip-flop into a T flip-flop. 


6.7.5 EXAMPLE—A DIFFERENT COUNTER 


Having considered the design of an ordinary counter, we will now apply this knowl- 
edge to design a slightly different counterlike circuit. Suppose that we wish to derive 
a three-bit counter that counts the pulses on an input line, w. But instead of displaying the 
count as 0, 1, 2,3, 4,5, 6, 7,0, 1,..., this counter must display the count in the sequence 
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Clock 


Resetn 


Figure 6.68 — Factored-form implementation of the counter. 


0,4, 2,6, 1,5, 3,7, 0,4, and so on. The count is to be represented directly by the flip-flop 
values themselves, without using any extra gates. Namely, Count = Q,Q,Qy. 

Since we wish to count the pulses on the input line w, it makes sense to use w as the 
clock input to the flip-flops. Thus the counter circuit should always be enabled, and it 
should change its state whenever the next pulse on the w line appears. The desired counter 
can be designed in a straightforward manner using the FSM approach. Figures 6.69 and 
6.70 give the required state table and a suitable state assignment. Using D flip-flops, we 
obtain the next-state equations 


Dr = Y. =Y> 

Di =Y%=y1 Py2 

Do = Yo = yoy; + Yo¥2 + Yoyiy2 
= yo(¥, + Y2) + Yoyiy2 
= yo ® yiyo 


This leads to the circuit in Figure 6.71. 
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Present | Next | Output 
state state 222120 


A 000 


100 
010 
110 
001 
101 
O11 
111 


ToOMMmMoIUADW 
>manmtmMoAaAD 


Figure 6.69 — State table for the counterlike example. 


Present | Next | Output 
state state 


yoyiyo | Y2¥1Yo | 222120 


000 100 000 
100 010 100 
010 110 010 
110 001 110 
001 101 001 
101 O11 101 
O11 111 O11 
111 000 111 


Figure 6.70 —State-assigned table for Figure 6.69. 


The reader should compare this circuit with the normal up-counter in Figure 5.23. Take 
the first three stages of that counter, set the Enable input to 1, and let Clock = w. Then 
the two circuits are essentially the same with one small difference in the order of bits in 
the count. In Figure 5.23 the top flip-flop corresponds to the least-significant bit of the 
count, whereas in Figure 6.71 the top flip-flop corresponds to the most-significant bit of 
the count. This is not just a coincidence. In Figure 6.70 the required count is defined 
as Count = yry1yo. However, if the bit patterns that define the states are viewed in the 
reverse order and interpreted as binary numbers, such that Count = yoy yz, then the states 
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Figure 6.71 — Circuit for Figure 6.70. 


A, B, C,..., H have the values 0, 1, 2,..., 7. These values are the same as the values that 
are associated with the normal three-bit up-counter. 
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6.8 FSM AS AN ARBITER CIRCUIT 


In this section we present the design of an FSM that is slightly more complex than the 
previous examples. The purpose of the machine is to control access by various devices 
to a shared resource in a given system. Only one device can use the resource at a time. 
Assume that all signals in the system can change values only following the positive edge of 
the clock signal. Each device provides one input to the FSM, called a request, and the FSM 
produces a separate output for each device, called a grant. A device indicates its need to use 
the resource by asserting its request signal. Whenever the shared resource is not already in 
use, the FSM considers all requests that are active. Based on a priority scheme, it selects 
one of the requesting devices and asserts its grant signal. When the device is finished using 
the resource, it deasserts its request signal. 
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We will assume that there are three devices in the system, called device 1, device 2, 
and device 3. It is easy to see how the FSM can be extended to handle more devices. The 
request signals are named 7, 72, and r3, and the grant signals are called g1, g2, and g3. The 
devices are assigned a priority level such that device | has the highest priority, device 2 has 
the next highest, and device 3 has the lowest priority. Thus if more than one request signal 
is asserted when the FSM assigns a grant, the grant is given to the requesting device that 
has the highest priority. 

A state diagram for the desired FSM, designed as a Moore-type machine, is depicted 
in Figure 6.72. Initially, on reset the machine is in the state called Idle. No grant signals 
are asserted, and the shared resource is not in use. There are three other states, called gntl, 
gnt2, and gnt3. Each of these states asserts the grant signal for one of the devices. 

The FSM remains in the /dle state as long as all of the request signals are 0. In the 
state diagram the condition r,72r3 = 000 is indicated by the arc labeled 000. When one or 
more request signals become 1, the machine moves to one of the grant states, according to 
the priority scheme. If 7; is asserted, then device 1 will receive the grant because it has 
the highest priority. This is indicated by the arc labeled 1 xx that leads to state gnt1, which 
sets g; = 1. The meaning of I xx is that the request signal r; is 1, and the values of signals 


Figure 6.72 — State diagram for the arbiter. 


6.8 FSM As AN ARBITER CIRCUIT 


r2 and r3 are irrelevant because of the priority scheme. As before, we use the symbol x 
to indicate that the value of the corresponding variable can be either 0 or 1. The machine 
stays in state gntl as long as r; is 1. When 7; = 0, the arc labeled Oxx causes a change on 
the next positive clock edge back to state Idle, and g; is deasserted. If other requests are 
active at this time, then the FSM will change to a new grant state after the next clock edge. 

The arc that causes a change to state gnf2 is labeled 01x. This label adheres to the 
priority scheme because it represents the condition that r2 = 1, but r; = 0. Similarly, the 
condition for entering state gnt3 is given as 001, which indicates that the only request signal 
asserted is r3. 

The state diagram is repeated in Figure 6.73. The only difference between this diagram 
and Figure 6.72 is the way in which the arcs are labeled. Figure 6.73 uses a simpler labeling 
scheme that is more intuitive. For the condition that leads from state Idle to state gnt1, the 
arc is labeled as 7), instead of 1xx. This label means that if r; = 1, the FSM changes to 
state gntl, regardless of the other inputs. The arc with the label 7172 that leads from state 
Idle to gnt2 represents the condition r;r2 = 01, while the value of r3 is irrelevant. There is 
no standardized scheme for labeling the arcs in state diagrams. Some designers prefer the 
style of Figure 6.72, while others prefer a style more similar to Figure 6.73. 


Figure 6.73 —_ Alternative style of state diagram for the arbiter. 
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Figure 6.74 gives the Verilog code for the machine. The three request and grant signals 
are specified as three-bit vectors r and g. The FSM transitions are described using a case 
statement in the style used for Figure 6.29. In the state /dle, the required priority scheme is 
described by using a nested casex statement. If r; = 1, then the next state for the machine 
is gntl. If 7; is not asserted, then the alternative is evaluated, which stipulates that ifr. = 1, 


module arbiter (r, Resetn, Clock, g); 
input [1:3] r; 
input Resetn, Clock; 
output wire [1:3] g; 
reg [2:1] y, Y; 
parameter Idle = 2’b00, gntl = 2’b01, gnt2 = 2’b10, gnt3 = 2’b11; 


// Next state combinational circuit 
always @(r, y) 
case (y) 
Idle: casex (r) 
3°b000: Y = Idle; 
3’blxx: Y=gntl; 
3°bOIx: Y= gnt2; 
3°b001: YY = gnt3; 
default: Y = Idle; 
endcase 
gntl: if (r[1]) Y=gntl; 
else Y = Idle; 
gnt2: if (r[2]) Y =gnt2; 
else Y = Idle; 
gnt3: if (r[3]) Y =gnt3; 
else Y = Idle; 
default: Y = Idle; 
endcase 


// Sequential block 

always @(posedge Clock) 
if (Resetn ==0) y <=Idle; 
else y<=Y; 


// Define output 

assign g[1] = (y == gntl); 
assign g[2] = (y == gnt2); 
assign g[3] = (y == gnt3); 


endmodule 


Figure 6.74 Verilog code for the arbiter. 
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then the next state will be gnt2. Each successive alternative considers a lower-priority 
request signal only if all of the higher-priority request signals are not asserted. 

The transitions for each grant state are straightforward. The FSM stays in state gntl 
as long as r; = 1; when r; = 0, the next state is Idle. The other grant states have the same 
structure. 

The grant signals, g;, g2, and g3 are defined at the end. The value of g; is set to 1 when 
the machine is in state gntl, and otherwise g, is set to 0. Similarly, each of the other grant 
signals is | only in the appropriate grant state. 
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In addition to knowing how to design a synchronous sequential circuit, the designer has to 
be able to analyze the behavior of an existing circuit. The analysis task is much simpler 
than the synthesis task. In this section we will show how analysis may be performed. 

To analyze a circuit, we simply reverse the steps of the synthesis process. The outputs 
of the flip-flops represent the present-state variables. Their inputs determine the next state 
that the circuit will enter. From this information we can construct the state-assigned table 
for the circuit. This table leads to a state table and the corresponding state diagram by 
giving a name to each state. The type of flip-flops used in the circuit is a factor, as we will 
see in the examples that follow. 


D-TYPE FLIP-FLOPS Figure 6.75 gives an FSM that has two D flip-flops. Let y; and yz be 
the present-state variables and Y; and Y> the next-state variables. The next-state and output 
expressions are 


Y; = wy, + wy2 
Y. = wy, + wy2 
f= yiy2 


Since there are two flip-flops, the FSM has at most four states. A good starting point in 
the analysis is to assume an initial state of the flip-flops such as y; = y2 = 0. From the 
expressions for Y; and Y2, we can derive the state-assigned table in Figure 6.76a. For 
example, in the first row of the table yj) = y2 = 0. Then w = 0 causes Y; = Y2 = 0, and 
w = | causes Y; = | and Y2 = 0. The output for this state is z = 0. The other rows are 
derived in the same manner. Labeling the states as A, B, C, and D yields the state table 
in Figure 6.76b. From this table it is apparent that following the reset condition the FSM 
produces the output z = 1 whenever three consecutive Is occur on the input w. Therefore, 
the FSM acts as a sequence detector for this pattern. 


Example 6.9 
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Figure 6.75 — Circuit for Example 6.9. 


Next state 
Present Gore 
state w=0 wel a 
21 YoY, Yo¥1 
00 00 01 0 
01 00 10 0 
10 00 11 0 
11 00 11 1 


(a) State-assigned table 


Present Next state Output 


state 


w=0 wel 


GVawPer 
>>> Pp 
VUAD 


(b) State table 


Figure 6.76 Tables for the circuit in Figure 6.75. 
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JK-TYPE FLIP-FLOPS Nowconsider the circuit in Figure 6.77, which has two JK flip-flops. Example 6.10 
The expressions for the inputs to the flip-flops are 


J, =w 
K,=W+5, 
Jn = wy 
Kx =Ww 


The output is given by z = y yo. 

From these expressions we can derive the excitation table in Figure 6.78. Interpreting 
the entries in this table, we can construct the state-assigned table. For example, consider 
yoy, = 00 and w = 0. Then, since Jz = J; = 0 and K2 = K; = 1, both flip-flops will re- 
main in the 0 state; hence Yo = Y; = 0. If yoy; = 00 and w = 1, then Jz = Ky = O and 
J, = K, = 1, which leaves the y, flip-flop unchanged and sets the y, flip-flop to 1; hence 
Y2 = O and Y; = 1. If yoy; = 01 and w = 0, then Jz = J; = 0 and Ky = K; = 1, which 
resets the y; flip-flop and results in the state yoy; = 00; hence Y2 = Y; = 0. Similarly, if 
yoy; = Ol andw = 1, thenJ2 = 1 and K2 = Osets y2 to 1; hence Y2 = 1, while J; = K, = 1 
toggles y,;; hence Y; = 0. This leads to the state y»y,; = 10. Completing this process, we 
find that the resulting state-assigned table is the same as the one in Figure 6.76a. The 
conclusion is that the circuits in Figures 6.75 and 6.77 implement the same FSM. 


Resetn 


Figure 6.77 — Circuit for Example 6.10. 
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Present Flip-flop inputs 
state w=0 w=l Output 
Yo JnK2 J Ki JK NK, Zz 
00 01 01 00 11 0 
01 01 oO1 10 11 0 
10 01 01 00 10 0 
11 01 Ol 10 10 1 


Figure 6.78 The excitation table for the circuit in Figure 6.77. 


Figure 6.79 —= Circuit for Example 6.11. 


Example 6.11 MIXED FLIP-FLOPS There is no reason why one cannot use a mixture of flip-flop types in 
one circuit. Figure 6.79 shows a circuit with one D and one T flip-flop. The expressions 
for this circuit are 


D, = w(, + y2) 
Ty = Wy2 + wy1 Yo 


z= yiy2 
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Present | Flip-flop inputs 
state w=0 w=1 Output 
21 | BD, THD ss 
00 00 01 0 
10 10 01 0 
11 10 01 1 


Figure 6.80 ~The excitation table for the circuit in Figure 6.79. 


From these expressions we derive the excitation table in Figure 6.80. Since it is a T flip- 
flop, y2 changes its state only when 7, = 1. Thus if yoy; = 00 and w = 0, then because 
T, = D,; = 0 the state of the circuit will not change. An example of where 7, = 1 is when 
yoy, = 01 and w = 1, which causes y, to change to 1; D; = 0 makes y; = 0, hence Y; = 1 
and Y; = 0. The other cases where 72 = 1 occur when w = 0 and yoy; = 10 or 11. In 
both of these cases D; = 0. Hence the T flip-flop changes its state from | to 0, while the D 
flip-flop is cleared, which means that the next state is YY; = 00. Completing this analysis 
we again obtain the state-assigned table in Figure 6.76a. Thus this circuit is yet another 
implementation of the FSM represented by the state table in Figure 6.76b. 
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The state diagrams and tables used in this chapter are convenient for describing the behavior 
of FSMs that have only a few inputs and outputs. For larger machines the designers often 
use a different form of representation, called the algorithmic state machine (ASM) chart. 

An ASM chart is a type of flowchart that can be used to represent the state transitions 
and generated outputs for an FSM. The three types of elements used in ASM charts are 
depicted in Figure 6.81. 


e State box — A rectangle represents a state of the FSM. It is equivalent to a node in the 
state diagram or a row in the state table. The name of the state is indicated outside the 
box in the top-left corner. The Moore-type outputs are listed inside the box. These are 
the outputs that depend only on the values of the state variables that define the state; 
we will refer to them simply as Moore outputs. It is customary to write only the name 
of the signal that has to be asserted. Thus it is sufficient to write z, rather than z = I, to 
indicate that the output z must have the value 1. Also, it may be useful to indicate an 
action that must be taken; for example, Count <- Count + | specifies that the contents 
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State name | | 


Output signals 0 (False) 


1 (True) 


Condition 
expression 


or actions 
(Moore type) 


(a) State box (b) Decision box 


Conditional outputs 
or actions (Mealy type) 


(c) Conditional output box 


Figure 6.81 — Elements used in ASM charts. 


of a counter have to be incremented by 1. Of course, this is just a simple way of saying 
that the control signal that causes the counter to be incremented must be asserted. We 
will use this way of specifying actions in larger systems that are discussed in Chapter 7. 


e Decision box — A diamond indicates that the stated condition expression is to be tested 
and the exit path is to be chosen accordingly. The condition expression consists of 
one or more inputs to the FSM. For example, w indicates that the decision is based 
on the value of the input w, whereas w, - w2 indicates that the true path is taken if 
W = W2 = | and the false path is taken otherwise. 


¢ Conditional output box — An oval denotes the output signals that are of Mealy type. 
These outputs depend on the values of the state variables and the inputs of the FSM; 
we will refer to these outputs simply as Mealy outputs. The condition that determines 
whether such outputs are generated is specified in a decision box. 


Figure 6.82 gives the ASM chart that represents the FSM in Figure 6.3. The transitions 
between state boxes depend on the decisions made by testing the value of the input variable 
w. In each case if w = 0, the exit path from a decision box leads to state A. If w = 1, then 
a transition from A to B or from B to C takes place. If w = | in state C, then the FSM 
stays in that state. The chart specifies a Moore output z, which is asserted only in state C, 
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Reset 


Figure 6.82 = ASM chart for the FSM in Figure 6.3. 


as indicated in the state box. In states A and B, the value of z is 0 (not asserted), which is 
implied by leaving the corresponding state boxes blank. 

Figure 6.83 provides an example with Mealy outputs. This chart represents the FSM 
in Figure 6.23. The output, z, is equal to 1 when the machine is in state B and w = 1. This 
is indicated using the conditional output box. In all other cases the value of z is 0, which 
is implied by not specifying z as an output of state B for w = 0 and state A for w equal to 0 
or |. 

Figure 6.84 gives the ASM chart for the arbiter FSM in Figure 6.73. The decision box 
drawn below the state box for dle specifies that if r; = 1, then the FSM changes to state 
gntl. In this state the FSM asserts the output signal g;. The decision box to the right of 
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Reset 


Figure 6.83 = ASM chart for the FSM in Figure 6.23. 


the state box for gnt/ specifies that as long as 7; = 1, the machine stays in state gnt/, and 
when r; = 0, it changes to state Idle. The decision box labeled r, that is drawn below the 
state box for Idle specifies that if r2 = 1, then the FSM changes to state gnt2. This decision 
box can be reached only after first checking the value of 7; and following the arrow that 
corresponds to r; = 0. Similarly, the decision box labeled 73 can be reached only if both r; 
and rz have the value 0. Hence the ASM chart describes the required priority scheme for 
the arbiter. 

ASM charts are similar to traditional flowcharts. Unlike a traditional flowchart, the 
ASM chart includes timing information because it implicitly specifies that the FSM changes 
(flows) from one state to another only after each active clock edge. The examples of ASM 
charts presented here are quite simple. We have used them to introduce the ASM chart 
terminology by giving examples of state, decision, and conditional-output boxes. Another 
term sometimes applied to ASM charts is ASM block, which refers to a single state box and 
any decision and conditional-output boxes that the state box may be connected to. The ASM 
charts can be used to describe complex circuits that include one or more finite state machines 
and other circuitry such as registers, shift registers, counters, adders, and multipliers. We 
will use ASM charts as an aid for designing more complex circuits in Chapter 7. 
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Reset 


gntl 


CoR<S: 


gnt2 


oS: 


gnt3 


Pa <> 


Figure 6.84 = ASM chart for the arbiter FSM in Figure 6.73. 


405 


6.11 FORMAL MODEL FOR SEQUENTIAL CIRCUITS 


This chapter has presented the synchronous sequential circuits using a rather informal 
approach because this is the easiest way to grasp the concepts that are essential in designing 
such circuits. The same topics can also be presented in a more formal manner, which has 
been the style adopted in many books that emphasize the switching theory aspects rather 
than the design using CAD tools. A formal model often gives a concise specification that 
is difficult to match in a more descriptive presentation. In this section we will describe a 
formal model that represents a general class of sequential circuits, including those of the 
synchronous type. 

Figure 6.85 represents a general sequential circuit. The circuit has W = {wi, wo, ..., 
w,} inputs, Z = {Z1, Z2,..., Zm} outputs, y = {y1, y2,..., ye} present-state variables, and 
Y = {Y;, Yo,..., Y,} next-state variables. It can have up to 2* states, S = {S,, So, ..., Sox}. 
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Ww, ——e zy 
Inputs Outputs 
w Combinational Zn 
circuit 
Present-state Next-state 
variables variables 


Figure 6.85 — The general model for a sequential circuit. 


There are delay elements in the feedback paths for the state-variables which ensure that y 
will take the values of Y after a time delay A. In the case of synchronous sequential 
circuits, the delay elements are flip-flops, which change their state on the active edge of a 
clock signal. Thus the delay A is determined by the clock period. The clock period must 
be long enough to allow for the propagation delay in the combinational circuit, in addition 
to the setup and hold parameters of the flip-flops. 

Using the model in Figure 6.85, a synchronous sequential circuit, M, can be defined 
formally as a quintuple 


M =(W,Z,S, 9,4) 
where 


e W, Z, and S are finite, nonempty sets of inputs, outputs, and states, respectively. 
e is the state transition function, such that $(¢ + 1) = g[W(), S(O). 


e i is the output function, such that A(t) = A[S(f)] for the Moore model and A(t) = 
AL[W (t), S(t)] for the Mealy model. 


This definition assumes that the time between f and t + | is one clock cycle. 

We will see in Chapter 9 that the delay A need not be controlled by a clock. In 
asynchronous sequential circuits the delays are due solely to the propagation delays through 
various gates. 
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6.12 CONCLUDING REMARKS 


The existence of closed loops and delays in a sequential circuit leads to a behavior that is 
characterized by the set of states that the circuit can reach. The present values of the inputs 
are not the sole determining factor in this behavior, because a given valuation of inputs may 
cause the circuit to behave differently in different states. 

The propagation delays through a sequential circuit must be taken into account. The 
design techniques presented in this chapter are based on the assumption that all changes in 
the circuit are triggered by the active edge of a clock signal. Such circuits work correctly 
only if all internal signals are stable when the clock signal arrives. Thus the clock period 
must be longer than the longest propagation delay in the circuit. 

Synchronous sequential circuits are used extensively in practical designs. They are 
supported by the commonly used CAD tools. All textbooks on the design of logic circuits 
devote considerable space to synchronous sequential circuits. Some of the more notable 
references are [1-14]. 

In Chapter 9 we will present a different class of sequential circuits, which do not use 
flip-flops to represent the states of the circuit and do not use clock pulses to trigger changes 
in the states. 


6.13 EXAMPLES OF SOLVED PROBLEMS 


This section presents some typical problems that the reader may encounter, and shows how 
such problems can be solved. 


Problem: Design an FSM that has an input w and an output z. The machine is a sequence 
detector that produces z = | when the previous two values of w were 00 or 11; otherwise 
z=0. 


Solution: Section 6.1 presents the design of an FSM that detects the occurrence of con- 
secutive ls. Using the same approach, the desired FSM can be specified using the state 
diagram in Figure 6.86. State C denotes the occurrence of two or more Os, and state E 
denotes two or more Is. The corresponding state table is shown in Figure 6.87. 

A straightforward state assignment leads to the state-assigned table in Figure 6.88. The 
codes y3y2y; = 101, 110, 111 can be treated as don’t-care conditions. Then the next-state 
expressions are 


Y = wy ¥3 + WYoy3 + Wyiy2 + WY1V2 
Y2 = yiY2 + Yiy2 + wy2y3 
Y3 = wy3 + wy y2 


The output expression is 


Z=y3+YViy2 


Example 6.12 
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Figure 6.86 = State diagram for Example 6.12. 


Present Next state Output 
state w=0 wel Z 
A B D 0 
B C D 0 
C C D 1 
D B E 0 
E B E 1 


Figure 6.87 — State table for the FSM in Figure 6.86. 


These expressions seem to be unnecessarily complex, suggesting that we may attempt to 
find a better state assignment. Observe that state A is reached only when the machine is reset 
by means of the Reset input. So, it may be advantageous to assign the four codes in which 
y3 = | to the states B, C, D, and E. The result is the state-assigned table in Figure 6.89. 
From it, the next-state and output expressions are 


Y, = wy2 + Wy3y 


Y=w 
Y3=1 
£=Y1 


Since Y3 = 1, we do not need the flip-flop y3, and the expression for Y; simplifies to w @ yo. 
This is a much better solution. 
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Next state 
Present Can 
state w=0 wel a 
*39271 Y3¥2¥1 Y3Y2YV1 i 
A 000 001 O11 0 
B 001 010 O11 0 
C 010 010 O11 1 
D 011 001 100 0 
E 100 001 100 1 


Figure 6.88 —State-assigned table for the FSM in Figure 6.87. 


Next state 
Present Cutan 
state w=0 wel as 
¥3¥2V1 Y3Y2Y, = Y3Yo¥ 5 
A 000 100 110 0 
B 100 101 110 0 
C 101 101 110 1 
D 110 100 111 0 
E 111 100 111 1 


Figure 6.89 — An improved state assignment for the FSM in 
Figure 6.87. 


Problem: Implement the sequence detector of Example 6.12 by using two FSMs. One Example 6.13 
FSM detects the occurrence of consecutive 1s, while the other detects consecutive Os. 


Solution: A good realization of the FSM that detects consecutive 1s is given in Figure 6.17. 
The next-state and output expressions are 


Y, =Ww 
Y2 = wyi 
Zones = Y2 


A similar FSM that detects consecutive Os is defined in Figure 6.90. Its expressions are 
Y3=W 
Y4 = wy3 


Zeros = V4 
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Present Next state Output 
state w=0 wel | Zzeros 
D E D 0 
E F D 0 
F F D 1 
(a) State table 
pea Next state 
state w=0 we=1_ Output 
Y4Y3 Y4Y3 Y4¥3 Sap, 
D 00 01 00 0 
E 01 11 00 0 
F 11 11 00 1 
10 dd dd d 


(b) State-assigned table 


Figure 6.90 FSM that detects a sequence of two zeros. 


The output of the combined circuit is 


Z = Zones + Zzeros 


Example 6.14 


Problem: Derive a Mealy-type FSM that can act as a sequence detector described in 
Example 6.12. 


Solution: A state diagram for the desired FSM is depicted in Figure 6.91. The corresponding 
state table is presented in Figure 6.92. Two flip-flops are needed to implement this FSM. 
A state-assigned table is given in Figure 6.93, which leads to the next-state and output 
expressions 
Y,=1 
Y> =w 
Z = Wy1V2 + wy2 
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Reset 


Figure 6.91 — State diagram for Example 6.14. 
Present Next state Output z 
state |w=0 w=1/w=0 wel 
A B C 0 0 
B B C 1 0 
Cc B Cc 0 1 
Figure 6.92 — State table for the FSM in Figure 6.91. 
Present Next state Output 
state |w=0 w=1/w=0 wel 
Y2yI YY, YY Z z 
A 00 01 11 0 0 
B 01 01 11 1 0 
C 11 01 11 0 1 


Figure 6.93 


State-assigned table for the FSM in Figure 6.92. 
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Example 6.15 Problem: Implement the state-assigned table in Figure 6.89 using JK-type flip-flops. 


Solution: Figure 6.94 shows the excitation table. It results in the following next-state and 
output expressions 


J) = wy2 + Wy3yo 
K, = Wwy2 + wy1Y2 


Jn =W 
Kx =w 
jg = 1 
Kk; =0 
Z=Y1 


Example 6.16 Problem: Write Verilog code to implement the FSM in Figure 6.86. 


Solution: Using the style of code given in Figure 6.29, the required FSM can be specified 
as shown in Figure 6.95. 


Example 6.17 Problem: Write Verilog code to implement the FSM in Figure 6.91. 


Solution: Using the style of code given in Figure 6.36, the Mealy-type FSM can be speci- 
fied as shown in Figure 6.96. 


Example 6.18 Problem: In computer systems it is often desirable to transmit data serially, namely, one 
bit at a time, to save on the cost of interconnecting cables. This means that parallel data at 


Present Flip-flop inputs 


state w= 0 w= l 


z 
Y3Y2y1 Y3Yo¥, | 3K3 JoKo JK | Y3¥2¥, | 2K3  JoKo IK 


Output 


A 000 100 ld Od Od 110 Id ld Od 0 
B 100 101 d0 Od ld 110 d0 ld Od 0 
C 101 101 d0 Od dd 110 d0 ld dl 1 
D 110 100 d0 dl Od 111 d0 d0 Id 0 
E 111 100 d0 dl dl 111 d0 d0 d0 1 


Figure 6.94 _— Excitation table for the FSM in Figure 6.89 with JK flip-flops. 
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module sequence (Clock, Resetn, w, z); 
input Clock, Resetn, w; 
output z; 
reg [3:1] y, Y; 
parameter [3:1] A = 3’b000, B = 3’b001, C = 3’b010, D = 3’b011, E= 3’b100; 


// Define the next state combinational circuit 
always @(w, y) 
case (y) 

A: if(w) Y 

else Y 

B: if(w) Y 

else Y 

C: if(w) Y 
else Y= 

D: if(w) Y 

else Y 

E: if(w) Y 

else Y 

default: Y 
endcase 


// Define the sequential block 

always @(negedge Resetn, posedge Clock) 
if (Resetn ==0) y<=A; 
else y<=Y; 


// Define output 
assign z = (y ==C) | (y ==); 


endmodule 


Figure 6.95 Verilog code for the FSM in Figure 6.86. 


one end must be transmitted serially, and at the other end the received serial data has to be 
turned back into parallel form. Suppose that we wish to transmit ASCII characters in this 
manner. As explained in Chapter 1, the standard ASCII code uses seven bits to define each 
character. Usually, a character occupies one byte, in which case the eighth bit can either 
be set to 0 or it can be used to indicate the parity of the other bits to ensure a more reliable 
transmission. 

Parallel-to-serial conversion can be done by means of a shift register. Assume that a 
circuit accepts parallel data, B = b7, be, ..., bo, representing ASCII characters. Assume 
also that bit b7 is set to 0. The circuit is supposed to generate a parity bit, p, and send it 
instead of b7 as a part of the serial transfer. Figure 6.97 gives a possible circuit. An FSM is 
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module seqmealy (Clock, Resetn, w, z); 
input Clock, Resetn, w; 
output reg z; 
reg [2:1] y, Y; 
parameter [2:1] A =2’b00, B = 2’b01, C=2’b11; 


// Define the next state and output combinational circuits 
always @(w, y) 
case (y) 
A: if (w) 
begin 
z=0; Y=C; 
end 
else 
begin 
z=0; Y=B; 
end 
B: if (w) 
begin 
z=0; Y=C; 
end 
else 
begin 
z=1;Y=B; 
end 
C: if (w) 
begin 
Z=1;Y=C; 
end 
else 
begin 
z=0; Y=B; 
end 
default: 
begin 
z=0; Y = 2’bxx; 
end 
endcase 


// Define the sequential block 

always @(negedge Resetn, posedge Clock) 
if (Resetn ==0) y<=A; 
else y<=Y; 


endmodule 


Figure 6.96 Verilog code for the FSM in Figure 6.91. 
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Parallel input 


Load 


Clock 


Counter 


> 


Figure 6.97 —_Parallel-to-serial converter. 


used to generate the parity bit, which is included in the output stream by using a multiplexer. 
A three-bit counter is used to determine when the p bit is transmitted, which happens when 
the count reaches 7. Design the desired FSM. 


Solution: As the bits are shifted out of the shift register, the FSM examines the bits and 
keeps track of whether there has been an even or odd number of Is. It sets p to 1 if there is 
odd parity. Hence, the FSM must have two states. Figure 6.98 presents the state table, the 
state-assigned table, and the resulting circuit. The next state expression is 

Y =wy+wy 
The output p is just equal to y. 
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Present Next state Output 
state w=0 wel Pp 
Seven Seven Sodd 0 
Soda Sodd Seven 1 

(a) State table 
Next state 

Present 

state w=0 wel Output 
Y Y _ 
0 0) 1 0 
1 1 1 


(b) State-assigned table 


(c) Circuit 


Figure 6.98 — FSM for parity generation. 


PROBLEMS 


Answers to problems marked by an asterisk are given at the back of the book. 


An FSM is defined by the state-assigned table in Figure P6.1. Derive a circuit that realizes 
this FSM using D flip-flops. 


Derive a circuit that realizes the FSM defined by the state-assigned table in Figure P6.1 
using JK flip-flops. 


6.3 


6.4 
*6.5 


*6.6 


6.7 


6.8 


6.9 


6.10 
6.11 


PROBLEMS 417 


Present Next state 
state |w=0 we=1 Output 
YN] YoY; YoY z 
00 10 11 0 
01 01 00 0 
10 11 00 0 
11 10 01 1 


Figure P6.1 —State-assigned table for Problems 6.1 and 6.2. 


Derive the state diagram for an FSM that has an input w and an output z. The machine has 
to generate z = | when the previous four values of w were 1001 or 1111; otherwise, z = 0. 
Overlapping input patterns are allowed. An example of the desired behavior is 


w : 010111100110011111 
z : 000000100100010011 


Write Verilog code for the FSM described in Problem 6.3. 


Derive a minimal state table for a single-input and single-output Moore-type FSM that 
produces an output of | if in the input sequence it detects either 110 or 101 patterns. 
Overlapping sequences should be detected. 
Repeat Problem 6.5 for a Mealy-type FSM. 


Derive the circuits that implement the state tables in Figures 6.51 and 6.52. What is the 
effect of state minimization on the cost of implementation? 


Derive the circuits that implement the state tables in Figures 6.55 and 6.56. Compare the 
costs of these circuits. 


A sequential circuit has two inputs, w, and wo, and an output, z. Its function is to compare 
the input sequences on the two inputs. If w; = w2 during any four consecutive clock cycles, 
the circuit produces z = 1; otherwise, z = 0. For example 


w, : 0110111000110 
wz : 1110101000111 
z : 0000100001110 


Derive a suitable circuit. 
Write Verilog code for the FSM described in Problem 6.9. 


A given FSM has an input, w, and an output, z. During four consecutive clock pulses, 
a sequence of four values of the w signal is applied. Derive a state table for the FSM 
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that produces z = | when it detects that either the sequence w : 0010 or w: 1110 has been 
applied; otherwise, z = 0. After the fourth clock pulse, the machine has to be again in the 
reset state, ready for the next sequence. Minimize the number of states needed. 


Derive a minimal state table for an FSM that acts as a three-bit parity generator. For every 
three bits that are observed on the input w during three consecutive clock cycles, the FSM 
generates the parity bit p = | if and only if the number of Is in the three-bit sequence is 
odd. 


Write Verilog code for the FSM described in Problem 6.12. 


Draw timing diagrams for the circuits in Figures 6.43 and 6.47, assuming the same changes 
in a and b signals for both circuits. Account for propagation delays. 


Show a state table for the state-assigned table in Figure P6.1, using A, B, C, D for the four 
rows in the table. Give a new state-assigned table using a one-hot encoding. For A use the 
code y4y3y2y; = 0001. For states B, C, D use the codes 0010, 0100, and 1000, respectively. 
Synthesize a circuit using D flip-flops. 


Show how the circuit derived in Problem 6.15 can be modified such that the code yay3y2y, = 
0000 is used for the reset state, A, and the other codes for state B, C, D are changed as needed. 
(Hint: You do not have to resynthesize the circuit!) 


In Figure 6.59 assume that the unspecified outputs in states B and G are 0 and 1, respectively. 
Derive the minimized state table for this FSM. 


In Figure 6.59 assume that the unspecified outputs in states B and G are | and 0, respectively. 
Derive the minimized state table for this FSM. 


Derive circuits that implement the FSMs defined in Figures 6.57 and 6.58. Can you draw 
any conclusions about the complexity of circuits that implement Moore and Mealy types 
of machines? 


Design a counter that counts pulses on line w and displays the count in the sequence 
0, 2,1,3,0,2,.... Use D flip-flops in your circuit. 


Repeat Problem 6.20 using JK flip-flops. 
Repeat Problem 6.20 using T flip-flops. 


Design a modulo-6 counter, which counts in the sequence 0, 1, 2,3,4,5,0,1,.... The 
counter counts the clock pulses if its enable input, w, is equal to 1. Use D flip-flops in your 
circuit. 


Repeat Problem 6.23 using JK flip-flops. 
Repeat Problem 6.23 using T flip-flops. 


Design a three-bit counterlike circuit controlled by the input w. If w = 1, then the counter 
adds 2 to its contents, wrapping around if the count reaches 8 or 9. Thus if the present 
state is 8 or 9, then the next state becomes 0 or 1, respectively. If w = 0, then the counter 
subtracts | from its contents, acting as a normal down-counter. Use D flip-flops in your 
circuit. 
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Repeat Problem 6.26 using JK flip-flops. 
Repeat Problem 6.26 using T flip-flops. 


Derive the state table for the circuit in Figure P6.2. What sequence of input values on wire 
w is detected by this circuit? 


Figure P6.2 — Circuit for Problem 6.29. 


Write Verilog code for the FSM shown in Figure 6.57, using the style of code in Figure 6.29. 
Repeat Problem 6.30, using the style of code in Figure 6.34. 

Write Verilog code for the FSM shown in Figure 6.58, using the style of code in Figure 6.29. 
Repeat Problem 6.32, using the style of code in Figure 6.34. 

Write Verilog code for the FSM shown in Figure P6.1. 

Represent the FSM in Figure 6.57 in form of an ASM chart. 

Represent the FSM in Figure 6.58 in form of an ASM chart. 


The arbiter FSM defined in Section 6.8 (Figure 6.72) may cause device 3 to never get 
serviced if devices 1 and 2 continuously keep raising requests, so that in the Idle state it 
always happens that either device | or device 2 has an outstanding request. Modify the 
proposed FSM to ensure that device 3 will get serviced, such that if it raises a request, the 
devices | and 2 will be serviced only once before the device 3 is granted its request. 
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Write Verilog code for the FSM designed in Problem 6.37. 
Write Verilog code to specify the circuit in Figure 6.97. 


Section 6.5 presents a design for the serial adder. Derive a similar circuit that functions as 
a serial subtractor which produces the difference of operands A and B. (Hint: Use the rule 
for finding 2’s complements in Section 3.3.1 to generate the 2’s complement of B.) 


Write Verilog code that defines the serial subtractor designed in Problem 6.40. 


In Section 6.2 we stated that trying all possible state assignments in order to find the best 
one is impractical. Determine the number of possible state assignments for an FSM that 
has n states and for which k = loggn state variables are used. 
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CHAPTER OBJECTIVES 


In this chapter you will learn about aspects of digital system design, including 
e Bus structure 

e Asimple processor 

e Usage of ASM charts 

e Clock synchronization and timing issues 

e Flip-flop timing at the chip level 
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In the previous chapters we showed how to design many types of simple circuits, such as multiplexers, 
decoders, flip-flops, registers, and counters, which can be used as building blocks. In this chapter we provide 
examples of more complex circuits that can be constructed using the building blocks as subcircuits. Such 
larger circuits form a digital system. For practical reasons our examples of digital systems will not be large, 
but the design techniques presented are applicable to systems of any size. 

A digital system consists of two main parts, called the datapath circuit and the control circuit. The 
datapath circuit is used to store and manipulate data and to transfer data from one part of the system to 
another. Datapath circuits comprise building blocks such as registers, shift registers, counters, multiplexers, 
decoders, adders, and so on. The control circuit controls the operation of the datapath circuit. In Chapter 6 
we referred to the control circuits as finite state machines. We will give several examples of digital systems 
and show how to design their datapath and control circuits. 


| 7.1 Bus STRUCTURE 


Digital systems often contain a set of registers used to store data. As already discussed in 
Chapter 6, such registers can be interconnected via an interconnection network as illustrated 
in Figure 6.10. There are many ways of implementing an interconnection network. In this 
section we will present one of the commonly used schemes. 

Suppose that a digital system has a number of 77-bit registers, and that it must be possible 
to transfer data from any register to any other register. A simple way of providing the desired 
interconnectivity may be to connect each register to a common set of n wires, which are 
used to transfer data into and out of the registers. This common set of wires is usually 
called a bus. If common paths are used to transfer data from multiple sources to multiple 
destinations, it is necessary to ensure that only one register acts as a source at any given 
time and other registers do not interfere. We will present two plausible arrangements for 
implementing the bus structure. 


7.1.1 UsInG TRI-STATE DRIVERS TO IMPLEMENT A Bus 


Outputs of two ordinary logic gates cannot be connected together, because a short circuit 
would result if one gate forces the output value | while the other gate forces the value 0. 
Therefore, some special gates are needed if the outputs of two registers are to be connected 
to a common set of wires. A commonly used circuit element for this purpose is shown 
in Figure 7.1a. It has a data input w, an output f, and an enable input e. Its operation is 
illustrated by the equivalent circuit in part (b). The triangular symbol in the figure represents 
a noninverting driver, which is a circuit that performs no logic operation and its output 
simply replicates the input signal. Its purpose is to provide additional electrical driving 
capability. In conjunction with the output switch, it behaves as indicated in Figure 7.1c. 
When e = 1, the output reflects the logic value on the data input. But, whene = 0, the output 
is electrically disconnected from the data input, which is referred to as a high impedance 
state and it is usually denoted by the letter Z (or z). Because this circuit element exhibits 
three distinct states, 0, 1, and Z, itis called a tri-state driver (or buffer). Appendix B explains 
how it can be implemented with transistors. 


7.1. Bus STRUCTURE 


e=0 
e p>“) 


(a) Symbol (b) Equivalent circuit (c) Truth table 


Figure 7.1 _Tri-state driver. 


Consider a system that contains k n-bit registers, R1 to Rk. Figure 7.2 shows how these 
registers can be connected using tri-state drivers to implement the bus structure. The data 
outputs of each register are connected to tri-state drivers. When selected by their enable 
signals, the drivers place the contents of the corresponding register onto the bus wires. We 
showed in Figure 5.56 how an enable input can be added to a register. If the enable input 
is set to 1, then the contents of the register will be changed on the next active edge of the 
clock. In Figure 7.2 the enable input on each register is labeled L, which stands for load. 
The signal that controls the load input for a register Rj is denoted as Rjj,, while the signal 
that controls the associated tri-state driver is called Rjo,,. These signals are generated by a 
control circuit. 

In addition to registers, in areal system other types of circuit blocks would be connected 
to the bus. The figure shows how n bits of data from an external source can be placed on 
the bus, using the control input Extern. 

It is essential to ensure that only one circuit block attempts to place data onto the 
bus wires at any given time. The control circuit must ensure that only one of the tri-state 
driver enable signals, Rlou:,..., Rkouw, 18 asserted at a given time. The control circuit also 
produces the signals Rl jin, ..., Rkin, which determine when data is loaded into each register. 
In general, the control circuit could perform a number of functions, such as transferring the 
data stored in one register into another register and controlling the processing of data in 
various functional units of the system. Figure 7.2 shows an input signal named Function that 
instructs the control circuit to perform a particular task. The control circuit is synchronized 
by a clock input, which is the same clock signal that controls the k registers. 

Figure 7.3 provides a more detailed view of how the registers from Figure 7.2 can be 
connected to a bus. To keep the picture simple, 2 two-bit registers are shown, but the same 
scheme can be used for larger registers. For register R1, two tri-state drivers enabled by 
R144: are used to connect each flip-flop output to a wire in the bus. The D input on each 
flip-flop is connected to a 2-to-1 multiplexer, whose select input is controlled by Rlin. If 
R1j, = 0, the flip-flops are loaded from their Q outputs; hence the stored data does not 
change. But if R1;, = 1, data is loaded into the flip-flops from the bus. 

The system in Figure 7.2 can be used in many different ways, depending on the design 
of the control circuit and on how many registers and other circuit blocks are connected to 
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Data 


Extern 


Bus 


Control circuit 


Function 


Figure 7.2 — A digital system with k registers. 


the bus. As a simple example, consider a system that has three registers, R1, R2, and R3. 
We will specify a control circuit that performs a single function—it swaps the contents of 
registers R1 and R2, using R3 for temporary storage. 

The required swapping is done in three steps, each needing one clock cycle. In the first 
step the contents of R2 are transferred into R3. Then the contents of R1 are transferred into 
R2. Finally, the contents of R3, which are the original contents of R2, are transferred into 
R1. We have already designed the control circuit for this task in Example 6.1, in the form 
of a finite state machine. 

Note that we say that the contents of one register, R;, are “transferred” into another 
register, R;. This jargon is commonly used to indicate that the new contents of R; will be 
a copy of the contents of R;. The contents of R; are not changed as a result of the transfer. 
Therefore, it would be more precise to say that the contents of R; are “copied” into R;. 


7.1.2 UsiInG MULTIPLEXERS TO IMPLEMENT A Bus 


In Figure 7.2 we used tri-state drivers to control access to the bus. An alternative approach 
is to use multiplexers, as depicted in Figure 7.4. The outputs of each register are connected 
to a multiplexer. This multiplexer’s output is connected to the inputs of the registers, thus 
realizing the bus. The multiplexer select inputs determine which register’s contents appear 
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Figure 7.4 —_ Using multiplexers to implement a bus. 


on the bus. Although the figure shows just one multiplexer symbol, we actually need one 
multiplexer for each bit in the registers. For example, assume that there are 4 eight-bit 
registers, R1 to R4, plus the externally-supplied eight-bit Data. To interconnect them, we 
need eight 5-to-1 multiplexers. 

The control circuit can be designed as a finite state machine as mentioned in the previous 
subsection. However, instead of using the control signals Rjo,; to place the contents of 
register Rj onto the bus, we have to generate the select inputs for the multiplexers. 

The tri-state driver and multiplexer approaches for implementing a bus are both equally 
valid. However, some types of chips, such as FPGAs, do not contain a sufficient number of 
tri-state drivers to realize even moderately large buses. In such chips the multiplexer-based 
approach is the only practical alternative. In practice, circuits are designed with CAD tools. 
If the designer describes the circuit using tri-state drivers, but there are not enough such 
drivers in the target device, then the CAD tools automatically produce an equivalent circuit 
that uses multiplexers. 


7.1.3  VERILOG CODE FOR SPECIFICATION OF BUS STRUCTURES 


This section presents Verilog code for our circuit example that swaps the contents of two 
registers. We first give the code for the style of circuit in Figure 7.2 that uses tri-state drivers 
to implement the bus and then give the code for the style of circuit in Figure 7.4 that uses 
multiplexers. Figure 7.5 gives the code for an n-bit register of the type in Figure 7.3. The 
number of bits in the register is set by the parameter n, which has the default value of 8. 
The register is specified such that if the input L = 1, then the flip-flops are loaded from the 
n-bit input R. Otherwise, the flip-flops retain their presently stored values. 

Figure 7.6 gives the code for a subcircuit that represents n tri-state drivers, each enabled 
by the input Z. The inputs to the drivers are the n-bit signal Y, and the outputs are the n-bit 
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module regn (R, L, Clock, Q); 
parameter n = 8; 
input [n—1:0] R; 
input L, Clock; 
output reg [n—1:0] Q; 


always @(posedge Clock) 
if (L) 
Q<=R; 


endmodule 


Figure 7.5 = Code for an n-bit register of the type in 
Figure 7.3. 


module trin (Y, E, F); 
parameter n = 8; 
input [n—-1:0] Y; 
input E; 
output wire [n—1:0] F; 


assign F=E? Y: ’bz; 
endmodule 


Figure 7.6 = Code for an n-bit tri-state module. 


signal F. The conditional assignment statement specifies that the output of each driver is 
set to F = Y if E = 1; otherwise, the output is set to the high impedance value z. The 
conditional assignment statement uses an unsized number to define the high impedance 
case. The Verilog compiler will make the size of this number the same as the size of vector 
Y, namely n. We cannot define the number as n’bz because the size of a sized number 
cannot be given as a parameter. 

The code in Figure 7.7 represents a digital system like the one in Figure 7.2, with 3 
eight-bit registers, R1, R2, and R3. The circuit in Figure 7.2 includes tri-state drivers that 
are used to place n bits of externally supplied Data on the bus. In Figure 7.7, these drivers 
are instantiated in the module tri_ext. Each of the eight drivers is enabled by the input 
signal Extern, and the data inputs on the drivers are attached to the eight-bit signal Data. 
When Extern = 1, the value of Data is placed on the bus, which is represented by the signal 
BusWires. The BusWires vector represents the circuit’s output as well as the internal bus 
wiring. We declared this vector to be of tri type rather than of wire type. The keyword tri 
is treated in the same way as the keyword wire by the Verilog compiler. The designation tri 
makes it obvious to a reader that the synthesized connections will have tri-state capability. 


427 


428 CHAPTER 7 ce DIGITAL SYSTEM DESIGN 


module swap (Resetn, Clock, w, Data, Extern, RinExt1, RinExt2, RinExt3, BusWires, Done); 
parameter n= 8; 
input Resetn, Clock, w, Extern, RinExtl, RinExt2, RinExt3; 
input [n—1:0] Data; 
output tri [n—1:0] BusWires; 
output Done; 
wire [n—1:0] R1, R2, R3; 
wire Rlin, Rlout, R2in, R2out, R3in, R3out; 
reg [2:1] y, Y; 
parameter [2:1] A =2’b00, B = 2’b01, C=2’b10, D=2’b11; 


// Define the next state combinational circuit for FSM 

always @(w, y) 
case (y) 

A: if(w) Y 

else Y 

B: Y 

Y 

Y 


? 


B 

A 
C: 
D 
A 


C: : 
D: 


endcase 


// Define the sequential block for FSM 
always @(negedge Resetn, posedge Clock) 
if (Resetn==0) y<=A; 
else y<=Y; 


// Define outputs of FSM 
assign R2out = (y == B); 
assign R3in = (y == B); 
assign Rlout = (y ==C); 
assign R2in = (y == C); 
assign R3out = (y == D); 
assign Rlin = (y == D); 
assign Done = (y == D); 


// Instantiate registers 
regn reg_! (BusWires, RinExt1 | Rlin, Clock, R1); 
regn reg_2 (BusWires, RinExt2 | R2in, Clock, R2); 
regn reg_3 (BusWires, RinExt3 | R3in, Clock, R3); 
// Instantiate tri-state drivers 
trin tri_ext (Data, Extern, BusWires); 
trin tri_l (R1, Rlout, BusWires); 
trin tri_2 (R2, R2out, BusWires); 
trin tri_3 (R3, R3out, Bus Wires); 

endmodule 


Figure 7.7. — A digital system like the one in Figure 7.2. 


7.2. SIMPLE PROCESSOR 


We assume that three control signals named RinExtl, RinExt2, and RinExt3 exist, 
which allow the externally supplied data to be loaded from the bus into register R1, R2, or 
R3. These signals are not shown in Figure 7.2, to keep the figure simple, but they would 
be generated by the same external circuit block that produces Extern and Data. When 
RinExtl = 1, the data on the bus is loaded into register R1; when RinExt2 = 1, the data is 
loaded into R2; and when RinExt3 = 1, the data is loaded into R3. 

The FSM that implements the control circuit for our swapping task example is included 
in Figure 7.7. Since our control circuit performs only one operation, we have not included 
the Function input from Figure 7.2. We have assumed that an input signal named w exists, 
which is set to 1 for one clock cycle to start the swapping task. We have also assumed that 
at the end of the swapping task, which is indicated by the Done signal being asserted, the 
control circuit returns to the starting state. 


Verilog Code Using Multiplexers 

Figure 7.8 shows how the code in Figure 7.7 can be modified to use multiplexers instead 
of tri-state drivers. Using the circuit structure shown in Figure 7.4, the bus is implemented 
with eight 4-to-1 multiplexers. Three of the data inputs on each 4-to-1 multiplexer are 
connected to one bit from registers R1, R2, and R3. The fourth data input is connected 
to one bit of the Data input signal to allow externally supplied data to be written into the 
registers. 

The same FSM control circuit is used. However, the control signals Rlout, R2out, 
and R3out are not needed because tri-state drivers are not used. Instead, the required 
multiplexers are defined in an if-else statement by specifying the source of data based on 
the state of the FSM. Hence, when the FSM is in state A, the selected input to the multiplexers 
is Data. When the state is B, the register R2 provides the input data to the multiplexers, 
and so on. 
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A second example of a digital system like the one in Figure 7.2 is shown in Figure 7.9. 
It has four n-bit registers, RO, ..., R3, that are connected to the bus with tri-state drivers. 
External data can be loaded into the registers from the n-bit Data input, which is connected 
to the bus using tri-state drivers enabled by the Extern control signal. The system also 
includes an adder/subtractor module. One of its data inputs is provided by an n-bit register, 
A, that is attached to the bus, while the other data input, B, is directly connected to the bus. 
If the AddSub signal has the value 0, the module generates the sum A + B; if AddSub = 1, 
the module generates the difference A — B. To perform the subtraction, we assume that 
the adder/subtractor includes the required XOR gates to form the 2’s complement of B, as 
discussed in Section 3.3. The register G stores the output produced by the adder/subtractor. 
The A and G registers are controlled by the signals Ajn, Gin, and Goyr. 

The system in Figure 7.9 can perform various functions, depending on the design of 
the control circuit. As an example, we will design a control circuit that can perform the four 
operations listed in Table 7.1. The left column in the table shows the name of an operation 
and its operands; the right column indicates the function performed in the operation. For 
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module swapmux (Resetn, Clock, w, Data, RinExt1, RinExt2, RinExt3, BusWires, Done); 
parameter n = 8; 
input Resetn, Clock, w, RinExtl, RinExt2, RinExt3; 
input [n—1:0] Data; 
output reg [n—1:0] BusWires; 
output Done; 
wire [n—1:0] R1, R2, R3; 
wire Rlin, R2in, R3in; 
reg [2:1] y, Y; 
parameter [2:1] A =2’b00, B = 2’b01, C= 2’b10, D=2’b11; 


// Define the next state combinational circuit for FSM 
always @(w, y) 
case (y) 

A: if(w) Y=B 

else Y=A 

B: Y=C; 

C: Y=D 

D: Y=A 
endcase 


// Define the sequential block for FSM 
always @(negedge Resetn, posedge Clock) 
if (Resetn ==0) y<=A; 
else y<=Y; 


// Define control signals 

assign R3in = (y == B); 
assign R2in = (y ==C); 
assign Rlin = (y == D); 
assign Done = (y == D); 


// Instantiate registers 

regn reg_! (BusWires, RinExt1 | Rlin, Clock, R1); 
regn reg_2 (BusWires, RinExt2 | R2in, Clock, R2); 
regn reg_3 (BusWires, RinExt3 | R3in, Clock, R3); 


// Define the multiplexers 
always @(y, Data, R1, R2, R3) 
if (y == A) BusWires = Data; 
else if (y == B) BusWires = R2; 
else if (y == C) BusWires = R1; 
else BusWires = R3; 
endmodule 


Figure 7.8 — Using multiplexers to implement the bus structure. 
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Table 7.1 Operations performed 
in the processor. 


Operation Function performed 
Load Rx, Data Rx <— Data 
Move Rx, Ry Rx < [Ry] 

Add Rx, Ry Rx <— [Rx] + [Ry] 
Sub Rx, Ry Rx <— [Rx] — [Ry] 


the Load operation the meaning of Rx < Data is that the data on the external Data input 
is transferred across the bus into any register, Rx, where Rx can be RO to R3. The Move 
operation copies the data stored in register Ry into register Rx. In the table the square 
brackets, as in [Rx], refer to the contents of a register. Since only a single transfer across 
the bus is needed, both the Load and Move operations require only one step (clock cycle) to 
be completed. The Add and Sub operations require three steps, as follows: In the first step 
the contents of Rx are transferred across the bus into register A. Then in the next step, the 
contents of Ry are placed onto the bus. The adder/subtractor module performs the required 
function, and the results are stored in register G. Finally, in the third step the contents of G 
are transferred into Rx. 

A digital system that performs the types of operations listed in Table 7.1 is usually 
called a processor. The specific operation to be performed at any given time is indicated 
using the control circuit input named Function. The operation is initiated by setting the w 
input to 1, and the control circuit asserts the Done output when the operation is completed. 

In Figure 7.2 we used an FSM to implement the control circuit. It is possible to use a 
similar design for the system in Figure 7.9. To illustrate another approach, we will base the 
design of the control circuit on a counter. This circuit has to generate the required control 
signals in each step of each operation. Since the longest operations (Add and Sub) need 
three steps (clock cycles), a two-bit counter can be used. Figure 7.10 shows a two-bit up- 
counter connected to a 2-to-4 decoder. Decoders are discussed in Chapter 4. The decoder 
is enabled at all times by setting its enable (En) input permanently to the value 1. Each of 
the decoder outputs represents a step in an operation. When no operation is currently being 
performed, the count value is 00; hence the 7p output of the decoder is asserted. In the first 
step of an operation, the count value is 01, and 7; is asserted. During the second and third 
steps of the Add and Sub operations, T> and T3 are asserted, respectively. 

In each of steps Ty to 73, various control signal values have to be generated by the 
control circuit, depending on the operation being performed. Figure 7.11 shows that the 
operation is specified with six bits, which form the Function input. The two left-most bits, 
F =f fo, are used as a two-bit number that identifies the operation. To represent Load, 
Move, Add, and Sub, we use the codes f; fp = 00, 01, 10, and 11, respectively. The inputs 
Rx,Rxo are a binary number that identifies the Rx operand, while Ry; Ryo identifies the Ry 
operand. The Function inputs are stored in a six-bit Function Register when the F'R;,, signal 
is asserted. 
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Figure 7.10 — A part of the control circuit for the processor. 
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Figure 7.11 The function register and decoders. 
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Figure 7.11 also shows three 2-to-4 decoders that are used to decode the information 
encoded in the F, Rx, and Ry inputs. We will see shortly that these decoders are included 
as a convenience because their outputs provide simple-looking logic expressions for the 
various control signals. 

The circuits in Figures 7.10 and 7.11 form a part of the control circuit. Using the input 
w and the signals To,..., 73, Jo, ..., 13, Xo,---, X3, and Yo, ..., Y3, we will show how to 
derive the rest of the control circuit. It has to generate the outputs Extern, Done, Ain, Gin, 
Gout, AddSub, ROjin, ..., R3in, and ROow, ..., R3ou- The control circuit also has to generate 
the Clear and FR;, signals used in Figures 7.10 and 7.11. 

Clear and F'R;, are defined in the same way for all operations. Clear is used to ensure 
that the count value remains at 00 as long as w = 0 and no operation is being executed. Also, 
it is used to clear the count value to 00 at the end of each operation. Hence an appropriate 
logic expression is 


Clear = w Ty + Done 


The F'R;, signal is used to load the values on the Function inputs into the Function Register 
when w changes to 1. Hence 


FRin = WT 


The rest of the outputs from the control circuit depend on the specific step being performed 
in each operation. The values that have to be generated for each signal are shown in 
Table 7.2. Each row in the table corresponds to a specific operation, and each column 
represents one time step. The Extern signal is asserted only in the first step of the Load 
operation. Therefore, the logic expression that implements this signal is 


Extern = IpT, 


Done is asserted in the first step of Load and Move, as well as in the third step of Add and 
Sub. Hence 


Done = (Ip +1))T, + (64+ B)T3 


Table 7.2 — Control signals asserted in each operation/time step. 


T| To T3 
(Load): Io Extern, Rin = X, 
Done 
(Move): 1; Rin =X, Row = Y, 
Done 
(Add): 0) Rout = xX, Av Rout = Ys Gre (Grartin Rn = XxX, 
AddSub = 0 Done 
(Sub): rE} Rout =, Ain Rout = Y, Gia Gout, Rin =X, 
AddSub = 1 Done 
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The Aj,, Gin, and G,,, signals are asserted in the Add and Sub operations. Aj, is asserted in 
step T;, Gin is asserted in T, and Go; is asserted in T3. The AddSub signal has to be set to 
0 in the Add operation and to 1 in the Sub operation. This is achieved with the following 
logic expressions 


Ain = (bh +h)T| 
Gin = Un + B)To 
Gout = (hh + L)T3 


AddSub = I, 
The values of ROj,,...,R3ji, are determined using either the Xo,..., X3 signals or the 
Yo,..., ¥3 signals. In Table 7.2 these actions are indicated by writing either R;, = X or 


Rin = Y. The meaning of Rj, = X is that RO;, = Xo, Rlin = X1, and so on. Similarly, the 
values of ROgw,..., R3ou are specified using either Ro,,; = X or Row = Y. 

We will develop the expressions for RO;,, and RO,,,; by examining Table 7.2 and then 
show how to derive the expressions for the other register control signals. The table shows 
that RO;, is set to the value of Xo in the first step of both the Load and Move operations and 
in the third step of both the Add and Sub operations, which leads to the expression 


ROn = Uo + )T Xo + Up + 5)T3X0 


Similarly, RO,,,; is set to the value of Yo in the first step of Move. It is set to Xo in the first 
step of Add and Sub and to Yo in the second step of these operations, which gives 


ROow = LT Yo + Go + B)(TX0 + TrYo) 


The expressions for R1;, and R1,,; are the same as those for ROj;, and ROj + except that X; 
and Y, are used in place of Xp and Yo. The expressions for R2j,, R264, R3in, and R34, are 
derived in the same way. 

The circuits shown in Figures 7.10 and 7.11, combined with the circuits represented 
by the above expressions, implement the control circuit in Figure 7.9. 

Processors are extremely useful circuits that are widely used. We have presented only 
the most basic aspects of processor design. However, the techniques presented can be 
extended to design realistic processors, such as modern microprocessors. The interested 
reader can refer to books on computer organization for more details on processor design 
[1-2]. 


Verilog Code 

In this section we give two different styles of Verilog code for describing the system 
in Figure 7.9. The first style uses tri-state drivers to represent the bus, and it gives the logic 
expressions shown above for the outputs of the control circuit. The second style of code 
uses multiplexers to represent the bus, and it uses case statements that correspond to Table 
7.2 to describe the outputs of the control circuit. 

Verilog code for a two-bit up-counter, named upcount, is shown in Figure 7.12. It has 
a synchronous reset input, which is active high. Other subcircuits that we use in the Verilog 
code for the processor are the dec2to4, regn, and trin modules in Figures 4.31, 7.5, and 7.6. 


435 


436 


CHAPTER 7 ° DIGITAL SYSTEM DESIGN 


module upcount (Clear, Clock, Q); 
input Clear, Clock; 
output reg [1:0] Q; 


always @(posedge Clock) 


if (Clear) 
Q<=0; 
else 
Q<=Q+]1; 
endmodule 


Figure 7.12 — A two-bit up-counter with synchronous reset. 


Complete code for the processor is given in Figure 7.13. The instantiated modules 
counter and decT represent the subcircuits in Figure 7.10. Note that we have assumed that 
the circuit has an active-high reset input, Reset, which is used to initialize the counter to 
00. The statement assign Func = {F, Rx, Ry} uses the concatenate operator to create the 
six-bit signal Func, which represents the inputs to the Function Register in Figure 7.11. 
The functionreg module represents the Function Register with the data inputs Func and the 
outputs FuncReg. The instantiated modules decl, decX, and decY represent the decoders in 
Figure 7.11. Following these statements the previously derived logic expressions for the 
outputs of the control circuit are given. For ROin, ..., R3in and ROow, .--; R3our, a for loop 
is used to produce the expressions. 

At the end of the code, the adder/subtractor module is defined and the tri-state drivers 
and registers in the processor are instantiated. 


Using Multiplexers and Case Statements 


We showed in Figure 7.4 that a bus can be implemented with multiplexers, rather than 
tri-state drivers. Verilog code that describes the processor using this approach is shown 
in Figure 7.14. The code illustrates a different way of describing the control circuit in the 
processor. It does not give logic expressions for the signals Extern, Done, and so on, as 
in Figure 7.13. Instead, case statements are used to represent the information shown in 
Table 7.2. Each control signal is first assigned the value 0 as a default. This is required 
because the case statements specify the values of the control signals only when they should 
be asserted, as we did in Table 7.2. As explained in Chapter 5, when the value of a 
signal is not specified, the signal retains its current value. This implied memory results 
in a feedback connection, representing a latch, in the synthesized circuit. We avoid this 
problem by providing the default value of 0 for each of the control signals involved in the 
case statements. 

In Figure 7.13 the decoders decT and decl are used to decode the Count signal and the 
stored values of the F input, respectively. The decT decoder has the outputs Tp, ..., T3, 
and decI produces Ip, ..., 13. In Figure 7.14 these two decoders are not used, because they 
do not serve a useful purpose in this code. Instead, the signal J is defined as a two-bit 
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module proc (Data, Reset, w, Clock, F, Rx, Ry, Done, Bus Wires); 
input [7:0] Data; 
input Reset, w, Clock; 
input [1:0] F, Rx, Ry; 
output wire [7:0] BusWires; 
output Done; 
reg [0:3] Rin, Rout; 
reg [7:0] Sum; 
wire Clear, AddSub, Extern, Ain, Gin, Gout, FRin; 
wire [1:0] Count; 
wire [0:3] T, I, Xreg, Y; 
wire [7:0] RO, R1, R2, R3, A, G; 
wire [1:6] Func, FuncReg; 
integer k; 


upcount counter (Clear, Clock, Count); 
dec2to4 decT (Count, 1’b1, T); 


assign Clear = Reset | Done | (~w & T[0]); 
assign Func = {F, Rx, Ry}; 
assign FRin = w & T[0]; 


regn functionreg (Func, FRin, Clock, FuncReg); 
defparam functionreg.n = 6; 

dec2to4 decI (FuncReg[1:2], 1’b1, D; 

dec2to4 decX (FuncReg[3:4], 1’b1, Xreg); 

dec2to4 decY (FuncReg[5:6], 1’b1, Y); 


assign Extern = I[0] & T[1]; 

assign Done = ((I[0] | I[1]) & T[1]) | (A[2] | 1[3]) & T[3]); 
assign Ain = (I[2] | I[3]) & T[1]; 

assign Gin = (I[2] | I[3]) & T[2]; 

assign Gout = (J[2] | I[3]) & T[3]; 

assign AddSub = I[3]; 


... continued in Part b. 


Figure 7.13 = Code for the processor (Part a). 


signal, and the two-bit signal Count is used instead of T. These signals are used in the case 
statements. The code sets J to the value of the two left-most bits in the Function Register, 
which correspond to the stored values of the input F’. 

There are two nested levels of case statements. The first one enumerates the possible 
values of Count. For each alternative in this case statement, which represents a column 
in Table 7.2, there is a nested case statement that enumerates the four values of J. As 
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// RegCntl 
always @(I, T, Xreg, Y) 
for (k =0;k < 4;k=k+1) 
begin 
Rin[k] = ((1[O] | I[1]) & T[1] & Xreg[k}) | 
([2] | 1[3]) & T[3] & Xreg[k]); 
Rout[k] = (I[1] & TI1] & Y{k]) | ([2]| I[3]) & 
((TL1] & Xreg[k]) | (T[2] & Y[k}))); 
end 


trin tri_ext (Data, Extern, BusWires); 

regn reg_0 (BusWires, Rin[0], Clock, RO); 
regn reg_! (BusWires, Rin[1], Clock, R1); 
regn reg_2 (BusWires, Rin[2], Clock, R2); 
regn reg_3 (BusWires, Rin[3], Clock, R3); 


trin tri_O (RO, Rout[0], BusWires); 
trin tri_l (R1, Rout[1], BusWires); 
trin tri_2 (R2, Rout[2], BusWires); 
trin tri_3 (R3, Rout[3], BusWires); 
regn reg_A (BusWires, Ain, Clock, A); 


// alu 
always @(AddSub, A, BusWires) 
if (!AddSub) 
Sum = A + BusWires; 
else 
Sum = A — BusWires; 


regn reg_G (Sum, Gin, Clock, G); 
trin tri_G (G, Gout, BusWires); 


endmodule 


Figure 7.13 = Code for the processor (Part b). 


indicated by the comments in the code, the nested case statements correspond exactly to 
the information given in Table 7.2. 

At the end of Figure 7.14, the bus structure is described with an if-else statement which 
represents multiplexers that place the appropriate data onto BusWires, depending on the 
values of Row, Gour, and Extern. 

We synthesized a circuit to implement the code in Figure 7.14 in a chip. Figure 7.15 
gives an example of the results of a timing simulation. Each clock cycle in which w = | in 
this timing diagram indicates the start of an operation. In the first such operation, at 250 ns 
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module proc (Data, Reset, w, Clock, F, Rx, Ry, Done, BusWires); 
input [7:0] Data; 
input Reset, w, Clock; 
input [1:0] F, Rx, Ry; 
output reg [7:0] BusWires; 
output reg Done; 
reg [7:0] Sum; 
reg [0:3] Rin, Rout; 
reg Extern, Ain, Gin, Gout, AddSub; 
wire [1:0] Count, I; 
wire [0:3] Xreg, Y; 
wire [7:0] RO, RI, R2, R3, A, G; 
wire [1:6] Func, FuncReg, Sel; 


wire Clear = Reset | Done | (~w & ~Count[1] & ~Count[0]); 
upcount counter (Clear, Clock, Count); 
assign Func = {F, Rx, Ry}; 
wire FRin = w & ~Count[1] & ~ Count[0]; 
regn functionreg (Func, FRin, Clock, FuncReg); 
defparam functionreg.n = 6; 
assign I = FuncReg[1:2]; 
dec2to4 decX (FuncReg[3:4], 1’b1, Xreg); 
dec2to4 decY (FuncReg[5:6], 1’b1, Y); 


always @(Count, I, Xreg, Y) 
begin 
Extern = 1’b0; Done = 1’b0; Ain = 1’b0; Gin = 1’b0; 
Gout = 1’b0; AddSub = 1’b0; Rin = 4’b0; Rout = 4’b0; 
case (Count) 
2°b00: ; //no signals asserted in time step TO 
2°b01: //define signals in time step T1 
case (I) 
2’b00: begin //Load 
Extern = 1’b1; Rin = Xreg; Done = 1’b1; 
end 
2’b01: begin //Move 
Rout = Y; Rin = Xreg; Done = 1’b1; 
end 
default: begin //Add, Sub 
Rout = Xreg; Ain = 1’b1; 
end 
endcase 
. continued in Part b. 


Figure 7.14 — Alternative code for the processor (Part a). 
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2’°b10: //define signals in time step T2 
case(I) 
2°b10: begin //Add 
Rout = Y; Gin = 1’b1; 
end 
2’b11: begin //Sub 
Rout = Y; AddSub = 1’b1; Gin = 1’b1; 


end 
default: ; //Add, Sub 
endcase 
Wobii: 
case (I) 


2’b10, 2’b11: begin 
Gout = 1’b1; Rin = Xreg; Done = 1’b1; 
end 
default: ; //Add, Sub 
endcase 
endcase 
end 


regn reg_0 (BusWires, Rin[0], Clock, RO); 
regn reg_1 (BusWires, Rin[1], Clock, R1); 
regn reg_2 (BusWires, Rin[2], Clock, R2); 
regn reg_3 (BusWires, Rin[3], Clock, R3); 
regn reg_A (BusWires, Ain, Clock, A); 


. continued in Part c. 


Figure 7.14 — Alternative code for the processor (Part b). 


in the simulation time, the values of both inputs F and Rx are 00. Hence the operation 
corresponds to “Load RO, Data.” The value of Data is 2A, which is loaded into RO on the 
next positive clock edge. The next operation loads 55 into register R1, and the subsequent 
operation loads 22 into R2. At 850 ns the value of the input F is 10, while Rx = O1 and 
Ry = 00. This operation is “Add R1, RO.” In the following clock cycle, the contents of 
R1 (55) appear on the bus. This data is loaded into register A by the clock edge at 950 ns, 
which also results in the contents of RO (2A) being placed on the bus. The adder/subtractor 
module generates the correct sum (7F), which is loaded into register G at 1050 ns. After 
this clock edge the new contents of G (7F) are placed on the bus and loaded into register 
R1 at 1150 ns. Two more operations are shown in the timing diagram. The one at 1250 
ns (“Move R3, R1’) copies the contents of R1 (7F) into R3. Finally, the operation starting 
at 1450 ns (“Sub R3, R2”) subtracts the contents of R2 (22) from the contents of R3 (7F), 
producing the correct result, 7F — 22 = 5D. 
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// alu 
always @(AddSub, A, Bus Wires) 
begin 
if (!AddSub) 
Sum = A + BusWires; 
else 
Sum = A — BusWires; 
end 


regn reg_G (Sum, Gin, Clock, G); 
assign Sel = {Rout, Gout, Extern}; 


always @(Sel, RO, R1, R2, R3, G Data) 
begin 
if (Sel == 6’b100000) 
BusWires = RO; 
else if (Sel == 6’b010000) 
BusWires = R1; 
else if (Sel == 6’b001000) 
BusWires = R2; 
else if (Sel == 6’b000100) 
BusWires = R3; 
else if (Sel == 6’b000010) 


BusWires = G; 
else BusWires = Data; 
end 
endmodule 


Figure 7.14 — Alternative code for the processor (Part c). 
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We introduced algorithmic state machine (ASM) charts in Section 6.10 and showed how 
they can be used to describe finite state machines. ASM charts can also be used to describe 
digital systems that include both datapath and control circuits. We will illustrate how the 
ASM charts can be used as an aid in designing digital systems by using them in the remaining 
examples in this chapter. 

Suppose that we wish to count the number of bits in a register, A, that have the value 1. 
Figure 7.16 shows pseudo-code for a step-by-step procedure, or algorithm, that can be 
used to perform the required task. It assumes that A is stored in a register that can shift its 
contents in the left-to-right direction. The answer produced by the algorithm is stored in 
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250.0ns 500.0ns 750.0ns 1.Ous 1.25us 1.5us 1.75us 


Figure 7.15 Timing simulation for the Verilog code in Figure 7.14. 


B=0; 
while A 40 do 
if ag = 1 then 
B=B+1,; 
end if; 
Right-shift A; 
end while; 


Figure 7.16 — Pseudo-code for the bit counter. 


the variable named B. The algorithm terminates when A does not contain any more |s, that 
is when A = 0. In each iteration of the while loop, if the least-significant bit (LSB) of A is 
1, then B is incremented by 1; otherwise, B is not changed. The contents of A are shifted 
one bit to the right at the end of each loop iteration. 

Figure 7.17 gives an ASM chart that represents the algorithm in Figure 7.16. The state 
box for the starting state, S1, specifies that B is initialized to 0. We assume that an input 
signal, s, exists, which is used to indicate when the data to be processed has been loaded 
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Reset 


<2 <7 


1 


Figure 7.17 = ASM chart for the pseudo-code in Figure 7.16. 


into A, so that the machine can start. The decision box labeled s stipulates that the machine 
remains in state $1 as long as s = 0. The conditional output box with Load A written inside 
it indicates that A is loaded from external data inputs if s = 0 in state $1. 

When s becomes |, the machine changes to state $2. The decision box below the state 
box for $2 checks whether A = 0. If so, the bit-counting operation is complete; hence the 
machine should change to state S3. If not, the FSM remains in state $2. The decision 
box at the bottom of the chart checks the value of dg. If a9 = 1, B is incremented, which 
is indicated in the chart as B << B+ 1. If a9 = 0, then B is not changed. In state $3, B 
contains the result, which is the number of bits in A that were 1. An output signal, Done, is 
set to | to indicate that the algorithm is finished; the FSM stays in $3 until s goes back to 0. 


ASM Chart Implied Timing Information 

In Chapter 6 we said that ASM charts are similar to traditional flowcharts, except that 
the ASM chart implies timing information. We can use the bit-counting example to illustrate 
this concept. Consider the ASM block for state $2, which is shaded in blue in Figure 7.17. 
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In a traditional flowchart, when state $2 is entered, the value of A would first be shifted to 
the right. Then we would examine the value of A and if A’s LSB is 1, we would immediately 
add | to B. But, since the ASM chart represents a sequential circuit, changes in A and B, 
which represent the outputs of flip-flops, take place after the active clock edge. The same 
clock signal that controls changes in the state of the machine also controls changes in A 
and B. When the machine is in state $1, the next active clock edge will only perform the 
action specified inside the state box for $1, whichis B < 0. Hence in state $2, the decision 
box that tests whether A = 0, as well as the box that checks the value of ag, check the bits 
in A before they are shifted. If A = 0, then the FSM will change to state $3 on the next 
clock edge (this clock edge also shifts A, which has no effect because A is already 0 in this 
case.) On the other hand, if A 4 0, then the FSM does not change to $3, but remains in 
S2. At the same time, A is still shifted, and B is incremented if ao has the value 1. These 
timing issues are illustrated in Figure 7.21, which represents a simulation result for a circuit 
that implements the ASM chart. We show how the circuit is designed in the following 
discussion. 


Datapath Circuit 


By examining the ASM chart for the bit-counting circuit, we can infer the type of circuit 
elements needed to implement its datapath. We need a shift register that shifts left-to-right 
to implement A. It must have the parallel-load capability because of the conditional output 
box in state $1 that loads data into the register. An enable input is also required because 
shifting should occur only in state $2. A counter is needed for B, and it needs a parallel-load 
capability to initialize the count to 0 in state S1. It is not wise to rely on the counter’s reset 
input to clear B to 0 in state $1. In practice, the reset signal is used in a digital system for 
only two purposes: to initialize the circuit when power is first applied, or to recover from 
an error. The machine changes from state $3 to S1 as a result of s = 0; hence we should 
not assume that the reset signal is used to clear the counter. 

The datapath circuit is depicted in Figure 7.18. The serial input to the shift register, w, 
is connected to 0, because it is not needed. The load and enable inputs on the shift register 
are driven by the signals LA and EA. The parallel input to the shift register is named Data, 
and its parallel output is A. An n-input NOR gate is used to test whether A = 0. The output 
of this gate, z, is 1 when A = 0. Note that the figure indicates the n-input NOR gate by 
showing a single input connection to the gate, with the label n attached to it. The counter 
has /ogs(n) bits, with parallel inputs connected to 0 and parallel outputs named B. It also 
has a parallel-load input LB and enable input EB control signals. 


Control Circuit 

For convenience we can draw a second ASM chart that represents only the FSM needed 
for the control circuit, as shown in Figure 7.19. The FSM has the inputs s, ao, and z and 
generates the outputs EA, LB, EB, and Done. In state $1, LB is asserted, so that 0 is loaded 
in parallel into the counter. Note that for the control signals, like LB, instead of writing LB 
= 1, we simply write LB to indicate that the signal is asserted. We assume that external 
circuitry drives LA to | when valid data is present at the parallel inputs of the shift register, 
so that the shift register contents are initialized before s changes to |. In state $2, EA is 
asserted to cause a shift operation, and the count enable for B is asserted only if ay = 1. 
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Counter 


Figure 7.18 — Datapath for the ASM chart in Figure 7.17. 


Verilog Code 


The bit-counting circuit can be described in Verilog code as shown in Figure 7.20. We 
have chosen to define A as an eight-bit vector and B as a four-bit vector signal. The ASM 
chart in Figure 7.19 can be directly translated into code that describes the required control 
circuit. The signal y is used to represent the present state of the FSM, and Y represents the 
next state. The FSM is described with three always blocks: the block labeled State_table 
specifies the state transitions, the block labeled State_flipflops represents the state flip-flops, 
and the block labeled FSM_outputs specifies the generated outputs in each state. A default 
value is specified at the beginning of the FSM_outputs block for each output signal, and 
then individual output values are specified in the case statement. For example, the default 
value of EA is 0, and EA is set to | only in state $2. Thus, A will be shifted only when the 
FSM is in state $2 and an active clock edge occurs. 

The fourth always block defines the up-counter that implements B. The shift register 
for A is instantiated at the end of the code; its specification is given in Figure 5.60, but the 
parameter value has to be n = 8. Finally, the z signal is defined using the reduction NOR 
operator. 

We implemented the code in Figure 7.20 in a chip and performed a timing simulation. 
Figure 7.21 gives the results of the simulation for A = 00111011. After the circuit is reset, 
the input signal LA is set to 1, and the desired data, (3B)\6, is placed on the Data input. 
When s changes to 1, the next active clock edge causes the FSM to change to state $2. In 
this state, each active clock edge increments B if ag is 1, and shifts A. When A = 0, the 
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Reset 


Figure 7.19 = ASMchart for the bit counter control circuit. 


next clock edge causes the FSM to change to state $3, where Done is set to | and B has the 
correct result, B = 5. To check more thoroughly that the circuit is designed correctly, we 
should try different values of input data. 


7.4 SHIFT-AND-ADD MULTIPLIER 


We presented a circuit that multiplies two unsigned n-bit binary numbers in Figure 3.35. 
The circuit uses a two-dimensional array of identical subcircuits, each of which contains a 
full-adder and an AND gate. For large values of n, this approach may not be appropriate 
because of the large number of gates needed. Another approach is to use a shift register 
in combination with an adder to implement the traditional method of multiplication that 
is done by “hand.” Figure 7.22a illustrates the manual process of multiplying two binary 
numbers. The product is formed by a series of addition operations. For each bit 7 in the 
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module bitcount (Clock, Resetn, LA, s, Data, B, Done); 
input Clock, Resetn, LA, s; 
input [7:0] Data; 
output reg [3:0] B; 
output reg Done; 
wire [7:0] A; 
wire Zz; 
reg [1:0] Y, y; 
reg EA, EB, LB; 


// control circuit 
parameter S1 = 2’b00, S2 = 2’b01, S3 = 2’b10; 


always @(s, y, Z) 
begin: State_table 


case (y) 
Sl: if(!s) Y=S1; 
else Y = S2; 
82: if (z==0) Y=S82; 
else Y = S3; 
S3: if (s) Y= S3; 
else Y = S1; 
default: Y = 2’bxx; 
endcase 
end 


always @(posedge Clock, negedge Resetn) 
begin: State_flipflops 
if (Resetn == 0) 
y <=S1; 
else 
y <=Y; 
end 


... continued in Part b. 


Figure 7.20 _ Verilog code for the bit-counting circuit (Part a). 


multiplier that is 1, we add to the product the value of the multiplicand shifted to the left 7 
times. This algorithm can be described in pseudo-code as shown in Figure 7.22b, where A 
is the multiplicand, B is the multiplier, and P is the product. 

An ASM chart that represents the algorithm in Figure 7.22b is given in Figure 7.23. 
We assume that an input s is used to control when the machine begins the multiplication 
process. As long as s is 0, the machine stays in state $1 and the data for A and B can be 
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always @(y, A[0]) 
begin: FSM_outputs 
// defaults 
EA = 0; LB = 0; EB = 0; Done = 0; 
case (y) 
Sl: LB=1; 
S2: begin 
EA=1; 
if (A[0]) EB = 1; 
else EB = 0; 
end 
S3: Done = 1; 
endcase 
end 


// datapath circuit 


// counter B 
always @(negedge Resetn, posedge Clock) 
if (!Resetn) 
B <=0; 
else if (LB) 
B <=0; 
else if (EB) 
B<=B+1; 


shiftrne ShiftA (Data, LA, EA, 1’b0, Clock, A); 
assign z= ~| A; 


endmodule 


Figure 7.20 _ Verilog code for the bit-counting circuit (Part b). 


loaded from external inputs. In state $2 we test the value of the LSB of B, and if it is 1, we 
add A to P. Otherwise, P is not changed. The machine moves to state $3 when B contains 
0, because P has the final product in this case. For each clock cycle in which the machine 
is in state $2, we shift the value of A to the left, as specified in the pseudo-code in Figure 
7.22b. We shift the contents of B to the right so that in each clock cycle bg can be used to 
decide whether or not A should be added to P. 


Datapath Circuit 
We can now define the datapath circuit. To implement A we need a right-to-left shift 


register that has 2n bits. A 2n-bit register is needed for P, and it must have an enable input 
because the assignment P < P + A in state $2 is inside a conditional output box. A 2n-bit 
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Figure 7.21 Simulation results for the bit-counting circuit. 


Decimal Binary 
13 1101  Multiplicand 
x11 x1011 Multiplier 
13 1101 
13} 1101 
143 0000 | 
11014 
10001111 Product 
(a) Manual method 
P=0; 
fori = 0 ton —1do 
if b; = 1 then 
P=P-+4A; 
end if; 
Left-shift A; 
end for; 


(b) PSeudo-code 


Figure 7.22 = Analgorithm for multiplication. 
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Reset 


Figure 7.23 = ASM chart for the multiplier. 


adder is needed to produce P + A. Note that P is loaded with 0 in state $1, and P is loaded 
from the output of the adder in state $2. We cannot assume that the reset input is used to 
clear P, because the machine changes from state $3 back to $1 based on the s input, not the 
reset input. Hence a 2-to-1 multiplexer is needed for each input to P, to select either 0 or 
the appropriate sum bit from the adder. An n-bit left-to-right shift register is needed for B, 
and an n-input NOR gate can be used to test whether B = 0. 

Figure 7.24 shows the datapath circuit and labels the control signals for the shift reg- 
isters. The input data for the shift register that holds A is named DataA. Since the shift 
register has 2n bits, the most-significant n data inputs are connected to 0. A single multi- 
plexer symbol is shown connected to the register that holds P. This symbol represents 2n 
2-to-1 multiplexers that are each controlled by the Psel signal. 


Control Circuit 

An ASM chart that represents only the control signals needed for the multiplier is given 
in Figure 7.25. In state S1, Psel is set to 0 and EP is asserted, so that register P is cleared. 
When s = 0, parallel data can be loaded into shift registers A and B by an external circuit 
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LA 0 DataA LB DataB 


Shift-left E Shift-right 
register register 


Register 


P 


Figure 7.24 — Datapath circuit for the multiplier. 


that controls their parallel-load inputs LA and LB. When s = 1, the machine changes to state 
S2, where Psel is set to 1 and shifting of A and B is enabled. If bo = 1, the enable for P 
is asserted. The machine changes to state S3 when z = 1, and then remains in $3 and sets 
Done to the value | as long as s = 1. 


Verilog Code 

Verilog code for the multiplier is given in Figure 7.26. The number of bits in A and B is 
set by the parameter n. For registers that are 2n bits wide, the number of bits is set ton +n. 
By changing the value of the parameters, the code can be used for numbers of any size. 
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Reset 


- 


Figure 7.25 = ASM chart for the multiplier control circuit. 


The always blocks labeled State_table and State_flipflops define the state transitions and 
state flip-flops, respectively. The control circuit outputs are specified in the always block 
labeled FSM_outputs. The parallel data input on the shift register A is 2n bits wide, but 
DataA is only n bits wide. Hence the concatenate operation {{n{1’b0}}, DataA} is used to 
prepend n zeros onto DataA for loading into the shift register. The multiplexer needed for 
register P is defined using a for loop that defines 2n 2-to-1 multiplexers. Figure 7.27 gives 
a simulation result for the circuit generated from the code. After the circuit is reset, LA and 
LB are set to 1, and the numbers to be multiplied are placed on the DataA and DataB inputs. 
After s is set to 1, the FSM (y) changes to state $2, where it remains until B = 0. For each 
clock cycle in state $2, A is shifted to the left, and B is shifted to the right. In three of the 
clock cycles in state S2, the contents of A are added to P, corresponding to the three bits 
in B that have the value 1. When B = 0, the FSM changes to state S3 and P contains the 
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module multiply (Clock, Resetn, LA, LB, s, DataA, DataB, P, Done); 
parameter n= 8; 
input Clock, Resetn, LA, LB, s; 
input [n—1:0] DataA, DataB; 
output [n+n—1:0] P; 
output reg Done; 
wire Zz; 
reg [n+n—1:0] DataP; 
wire [n+n—1:0] A, Sum; 
reg [1:0] y, Y; 
wire [n—1:0] B; 
reg EA, EB, EP, Psel; 
integer k; 


// control circuit 
parameter S1 = 2’b00, S2 = 2’b01, S3 = 2’b10; 


always @(s, y, Z) 
begin: State_table 


case (y) 
Sl: if(s==0) Y=SI1; 
else Y = S2; 
82: if (z==0) Y=S82; 
else Y = S3; 
S3: if (s == 1) Y=S3; 
else Y = S1; 
default: Y = 2’bxx; 
endcase 
end 


always @(posedge Clock, negedge Resetn) 
begin: State_flipflops 
if (Resetn == 0) 
y <=S1; 
else 
y <=Y; 
end 


... continued in Part b. 


Figure 7.26 Verilog code for the multiplier circuit (Part a). 
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always @(s, y, B[0]) 
begin: FSM_outputs 
// defaults 
EA = 0; EB = 0; EP = 0; Done = 0; Psel = 0; 
case (y) 
Sl: EP=1; 
S2: begin 
EA = 1; EB = 1; Psel = 1; 
if (B[0O]) EP= 1; 
else EP = 0; 
end 
$3: Done = 1; 
endcase 
end 


//datapath circuit 


shiftrne ShiftB (DataB, LB, EB, 1’b0, Clock, B); 
defparam ShiftB.n = 8; 

shiftIne ShiftA ({ {n{1’b0}}, DataA}, LA, EA, 1’b0, Clock, A); 
defparam ShiftA.n = 16; 


assign z=(B==0); 
assign Sum =A+P; 


// define the 2n 2-to-1 multiplexers 
always @(Psel, Sum) 
for (k = 0; k < n+n; k = k+1) 
DataP[k] = Psel ? Sum[k] : 1’b0; 


regne RegP (DataP, Clock, Resetn, EP, P); 
defparam RegP.n = 16; 


endmodule 


Figure 7.26 _ Verilog code for the multiplier circuit (Part b). 


correct product, which is (64)16 x (19)i6 = (9C4)16. The decimal equivalent of this result 
is 100 x 25 = 2500. 

The number of clock cycles that the circuit requires to generate the final product is 
determined by the left-most digit in B that is 1. It is possible to reduce the number of clock 
cycles needed by using more complex shift registers for A and B. If the two right-most bits 
in B are both 0, then both A and B could be shifted by two bit positions in one clock cycle. 
Similarly, if the three lowest digits in B are 0, then a three bit-position shift can be done, 
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Figure 7.27 — Simulation results for the multiplier circuit. 


and so on. A shift register that can shift by multiple bit positions at once can be built using 
a barrel shifter. We leave it as an exercise for the reader to modify the multiplier to make 
use of a barrel shifter. 
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7.5 DIVIDER 


The preceding example implements the traditional method of performing multiplication by 
hand. In this example we will design a circuit that implements the traditional long-hand 
division. Figure 7.28a gives an example of long-hand division. The first step is to try to 
divide the divisor 9 into the first digit of the dividend 1, which does not work. Next, we 
try to divide 9 into 14, and determine that | is the first digit in the quotient. We perform 
the subtraction 14 — 9 = 5, bring down the last digit from the dividend to form 50, and 
then determine that the next digit in the quotient is 5. The remainder is 50 — 45 = 5, and 
the quotient is 15. Using binary numbers, as illustrated in Figure 7.28), involves the same 
process, with the simplification that each digit of the quotient can be only 0 or I. 

Given two unsigned n-bit numbers A and B, we wish to design a circuit that produces 
two n-bit outputs Q and R, where Q is the quotient A/B and R is the remainder. The 
procedure illustrated in Figure 7.28 can be implemented by shifting the digits in A to the 
left, one digit at a time, into a shift register R. After each shift operation, we compare R with 
B. If R = B, a 1 is placed in the appropriate bit position in the quotient and B is subtracted 
from R. Otherwise, a 0 bit is placed in the quotient. This algorithm is described using 
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9) 140 B— 1001)10001100 «—A 
as 1001| 
50 10001 
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B) 10000 
1001 
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101 <«—R 
(a) An example using decimal numbers (b) Using binary numbers 
R =0; 


fori = 0 ton — 1 do 
Left-shift R|| A; 


if R > B then 
gi =1; 
R=R-B; 
else 
Gi = 0; 
end if; 
end for; 


(c) Pseudo-code 


Figure 7.28 —= Analgorithm for division. 


pseudo-code in Figure 7.28c. The notation R||A is used to represent a 2n-bit shift register 
formed using R as the left-most 7 bits and A as the right-most n bits. 

In the long-division pseudo-code, each loop iteration results in setting a digit g; to either 
1 or 0. A straightforward way to accomplish this is to shift 1 or 0 into the least-significant 
bit of Q in each loop iteration. An ASM chart that represents the divider circuit is shown 
in Figure 7.29. The signal C represents a counter that is initialized to n — 1 in the starting 
state $1. In state S2, both R and A are shifted to the left, and then in state $3, B is subtracted 
from R if R > B. The machine changes to state $4 when C = 0. 


Datapath Circuit 

We need n-bit shift registers that shift right to left for A, R, and Q. An n-bit register is 
needed for B, and a subtractor is needed to produce R — B. We can use an adder module in 
which the carry-in is set to 1 and B is complemented. The carry-out, c,,;, of this module 
has the value | if the condition R > B is true. Hence the carry-out can be connected to the 
serial input of the shift register that holds Q, so that it is shifted into Q in state $3. Since 
R is loaded with 0 in state $1 and from the outputs of the adder in state $3, a multiplexer 
is needed for the parallel data inputs on R. The datapath circuit is depicted in Figure 7.30. 
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Figure 7.29 ASM chart for the divider. 


Note that the down-counter needed to implement C and the NOR gate that outputs a 1 when 
C = 0 are not shown in the figure. 


Control Circuit 

An ASM chart that shows only the control signals needed for the divider is given in 
Figure 7.31. In state $3 the value of c,,,; determines whether or not the sum output of the 
adder is loaded into R. The shift enable on Q is asserted in state $3. We do not have to 
specify whether | or 0 is loaded into Q, because c,,; is connected to Q’s serial input in 
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Figure 7.30 — Datapath circuit for the divider. 


the datapath circuit. We leave it as an exercise for the reader to write Verilog code that 
represents the ASM chart in Figure 7.31 and the datapath circuit in Figure 7.30. 


Enhancements to the Divider Circuit 

Using the ASM chart in Figure 7.31 causes the circuit to loop through states $2 and 
S3 for 2n clock cycles. If these states can be merged into a single state, then the number of 
clock cycles needed can be reduced to n. In state $3, if co, = 1, we load the sum output 
(result of the subtraction) from the adder into R, and (assuming z = 0) change to state $2. 
In state $2 we then shift R (and A) to the left. To combine $2 and S3 into a new state, called 
S2, we need to be able to place the sum into the left-most bits of R while at the same time 
shifting the MSB of A into the LSB of R. This step can be accomplished by using a separate 
flip-flop for the LSB of R. Let the output of this flip-flop be called rrp. It is initialized to 0 
when s = 0 in state $1. Otherwise, the flip-flop is loaded from the MSB of A. In state $2, 
if Cow = 0, R is shifted left and rro is shifted into R. But if co, = 1, R is loaded in parallel 
from the sum outputs of the adder. 

Figure 7.32 illustrates how the division example from Figure 7.285 can be performed 
using n clock cycles. The table in the figure shows the values of R, rro, A, and Q in each step 
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Rsel = 0, LR, LC 


Figure 7.31 = ASM chart for the divider control circuit. 


of the division. In the datapath circuit in Figure 7.30, we use a separate shift register for Q. 
This register is not actually needed, because the digits in the quotient can be shifted into the 
least-significant bit of the register used for A. In Figure 7.32 the digits of Q that are shifted 
into A are shown in blue. The first row in the table represents loading of initial data into 
registers A (and B) and clearing R and rro to 0. In the second row of the table, labeled clock 
cycle 0, the diagonal blue arrow shows that the left-most bit of A (1) is shifted into rro. The 
number in R||r79 is now 000000001, which is smaller than B (1001). In clock cycle 1, rro is 
shifted into R, and the MSB of A is shifted into rrp. Also, as shown in blue, a 0 is shifted into 
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Clock cycle R ITo A/Q 
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0 Shift left O00 O00 00/14 £00 0.11000 
1 Shift left, Q, <0 TOO OHNO 1) Oe ero lt oe 00% 
2 Shiftleft, Q<0 00000010 TT LOO URS 
3 Shiftleft, Q-0 00000100 ee 
4 Shift left, Q, <0 00 00 F000) i oto) eae 
5 Subtract, Qy < 1 OE eT ee eee 
6 Subtract, Ql 00001000 TOP euE Ts 
7 Subtract, Qe 1 00000111)/0 00000111 
8 Subtract, Qe! 00000101)/0 00001111 


Figure 7.32 = An example of division using n = 8 clock cycles. 


the LSB of Q (A). The number in R||rrg is now 000000010, which is still smaller than B. 
Hence, in clock cycle 2 the same actions are performed as for clock cycle 1. These actions 
are also performed in clock cycles 3 and 4, at which point R||r79 = 000010001. Since this is 
larger than B, in clock cycle 5 the result of the subtraction 000010001 — 1001 = 00001000 
is loaded into R. The MSB of A (1) is still shifted into rrg, and a | is shifted into Q. In clock 
cycles 6, 7, and 8, the number in R||rro is larger than B; hence in each of these cycles the 
result of the subtraction R||rro — B is loaded into R, and a 1 is loaded into Q. After clock 
cycle 8 the correct result, Q = 00001111 and R = 00000101, is obtained. The bit rrp is not 
a part of the final result. 

An ASM chart that shows the values of the required control signals for the enhanced 
divider is depicted in Figure 7.33. The signal ERO is used in conjunction with the flip-flop 
that has the output 779. When ERO = 0, the value 0 is loaded into the flip-flop. When ERO 
is set to 1, the MSB of shift register A is loaded into the flip-flop. In state $1, if s = 0, then 
LR is asserted to initialize R to 0. Registers A and B can be loaded with data from external 
inputs. When s changes to 1, the machine makes a transition to state $2 and at the same 
time shifts R||RO||A to the left. In state $2, if co,, = 1, then R is loaded in parallel from 
the sum outputs of the adder. At the same time, RO||A is shifted left (779 is not shifted into 
R in this case). If coy; = 0, then R||RO||A is shifted left. The ASM chart shows how the 
parallel-load and enable inputs on the registers have to be controlled to achieve the desired 
operation. 

The datapath circuit for the enhanced divider is illustrated in Figure 7.34. As discussed 
for Figure 7.32, the digits of the quotient Q are shifted into register A. Note that one of 
the n-bit data inputs on the adder module is composed of the n — | least-significant bits in 
register R concatenated with bit 77o on the right. 
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Figure 7.33 = ASM chart for the enhanced divider control circuit. 


Verilog Code 

Figure 7.35 shows Verilog code that represents the enhanced divider. The parameter n 
sets the number of bits in the operands. The State_table, State_flipflops, and FSM_outputs 
always blocks describe the control circuit, as in the previous examples. The shift registers 
and counters in the datapath circuit are instantiated at the bottom of the code. The signal 
rro in Figure 7.32 is represented in the code by the signal RO. This signal is implemented 
as the output of the muxdff component; the code for this subcircuit is shown in Figure 5.47. 
Note that the adder that produces the Sum signal has one input defined as the concatenation 
of R with RO. The multiplexer needed for the input to R is represented by the DataR signal. 
This multiplexer is defined in the last statement of the code. 
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Figure 7.34 —Datapath circuit for the enhanced divider. 
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module divider (Clock, Resetn, s, LA, EB, DataA, DataB, R, Q, Done); 
parameter n = 8, logn = 3; 
input Clock, Resetn, s, LA, EB; 
input [n—1:0] DataA, DataB; 
output [n—1:0] R, Q; 
output reg Done; 
wire Cout, z, RO; 
wire [n—1:0] DataR; 
wire [n:0] Sum; 
reg [1:0] y, Y; 
wire [n—1:0] A, B; 
wire [logn—1:0] Count; 
reg EA, Rsel, LR, ER, ERO, LC, EC; 
integer k; 


// control circuit 
parameter S1 = 2’b00, S2 = 2’b01, S3 = 2’b10; 


always @(s, y, Z) 
begin: State_table 
case (y) 
Sl: if(s ==0) Y=S1; 
else Y = S2; 
82: if (z==0) Y = S82; 
else Y = S3; 
S3: if (s == 1) Y = S3; 
else Y = S1; 
default: Y = 2’bxx; 
endcase 
end 


always @(posedge Clock, negedge Resetn) 
begin: State_flipflops 
if (Resetn == 0) 
y <=S1; 
else 
y <=Y; 
end 


.. continued in Part b. 


Figure 7.35 Verilog code for the divider circuit (Part a). 
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always @(y, s, Cout, z) 
begin: FSM_outputs 
// defaults 
LR = 0; ER = 0; ERO = 0; LC = 0; EC = 0; EA = 0; 
Rsel = 0; Done = 0; 
case (y) 
Sl: begin 
LC = 1; ER=1; 
if (s == 0) 
begin 
LR = 1; ERO =0; 
end 
else 
begin 
LR = 0; EA= 1; ERO=1; 
end 
end 
82: begin 
Rsel = 1; ER = 1; ERO = 1; EA=1; 
if (Cout) LR = 1; 
else LR = 0; 
if (z == 0) EC=1; 
else EC = 0; 
end 
S3: Done = 1; 
endcase 
end 


.. continued in Part c. 


Figure 7.35 Verilog code for the divider circuit (Part b). 


A simulation result for the circuit produced from the code is given in Figure 7.36. The 
data A = A6 and B = 8 is loaded, and then s is set to 1. The circuit changes to state $2 
and concurrently shifts R, RO, and A to the left. The output of the shift register that holds A 
is labeled Q in the simulation results because this shift register contains the quotient when 
the division operation is complete. On the first three active clock edges in state $2, the 
number represented by R||RO is less than the number in B (8); hence R||RO||A is shifted 
left on each clock edge, and 0 is shifted into Q. In the fourth consecutive clock cycle for 
which the FSM has been in state $2, the contents of R are 00000101 = (5)j0, and RO is 
0; hence R||RO = 000001010 = (10)19. On the next active clock edge, the output of the 
adder, which is 10 — 8 = 2, is loaded into R, and | is shifted into Q. After n clock cycles in 
state $2, the circuit changes to state $3, and the correct result, Q = 14 = (20)19 and R = 6, 
is obtained. 
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//datapath circuit 


regne RegB (DataB, Clock, Resetn, EB, B); 
defparam RegB.n = n; 

shiftIne ShiftR (DataR, LR, ER, RO, Clock, R); 
defparam ShiftR.n = n; 

muxdff FF_RO (1’b0, A[n—1], ERO, Clock, RO); 

shiftlne ShiftA (DataA, LA, EA, Cout, Clock, A); 
defparam ShiftA.n =n; 

assign Q=A; 

downcount Counter (Clock, EC, LC, Count); 
defparam Counter.n = logn; 


assign z = (Count == 0); 
assign Sum = {1’b0, R[n—2:0], RO} + {1’b0, ~B} + 1; 


assign Cout = Sum[n]; 


// define the n 2-to-1 multiplexers 
assign DataR = Rsel ? Sum: 0; 


endmodule 


Figure 7.35 Verilog code for the divider circuit (Part c). 


125.0ns 250.0ns 375.0ns 500.0ns 625.0ns 
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Figure 7.36 — Simulation results for the divider circuit. 
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7.6 ARITHMETIC MEAN 


Assume that k n-bit numbers are stored in a set of registers Ro, ..., Ry. We wish to design 
a circuit that computes the mean M of the numbers in the registers. The pseudo-code for 
a suitable algorithm is shown in Figure 7.37a. Each iteration of the loop adds the contents 
of one of the registers, denoted R;, to a Sum variable. After the sum is computed, M is 
obtained as Sum/k. We assume that integer division is used, so a remainder R, not shown 
in the code, is produced as well. 

An ASM chart is given in Figure 7.37b. While the start input, s, is 0, the registers can 
be loaded from external inputs. When s becomes 1, the machine changes to state $2, where 
it remains while C 4 0, and computes the summation (C is a counter that represents i in 
Figure 7.37a). When C = 0, the machine changes to state $3 and computes M = Sum/k. 
From the previous example, we know that the division operation requires multiple clock 
cycles, but we have chosen not to indicate this in the ASM chart. After computing the 
division operation, state $4 is entered and Done is set to 1. 


Datapath Circuit 


The datapath circuit for this task is more complex than in our previous examples. It is 
depicted in Figure 7.38. We need a register with an enable input to hold Sum. For simplicity, 
assume that the sum can be represented in n bits without overflowing. A multiplexer is 
required on the data inputs on the Sum register, to select 0 in state $1 and the sum outputs 
of an adder in state S2. The Sum register provides one of the data inputs to the adder. The 
other input has to be selected from the data outputs of one of the k registers. One way 
to select among the registers is to connect them to the data inputs of a k-to-1 multiplexer 
that is connected to the adder. The select lines on the multiplexer can be controlled by the 
counter C. To compute the division operation, we can use the divider circuit designed in 
Section 7.5. 

The circuit in Figure 7.38 is based on k = 4, but the same circuit structure can be 
used for larger values of k. Note that the enable inputs on the registers Ro through R3 are 
connected to the outputs of a 2-to-4 decoder that has the two-bit input RAdd, which stands 
for “register address.” The decoder enable input is driven by the ER signal. All registers 
are loaded from the same input lines, Data. Since k = 4, we could perform the division 
operation simply by shifting Sum two bits to the right, which can be done in one clock cycle 
with a shift register that shifts by two digits. To obtain a more general circuit that works 
for any value of k, we use the divider circuit designed in Section 7.5. 


Control Circuit 


Figure 7.39 gives an ASM chart for the FSM needed to control the circuit in Figure 7.38. 
While in state $1, data can be loaded into registers Ro, ..., Ry—1. But no control signals 
have to be asserted for this purpose, because the registers are loaded under control of the 
ER and RAdd inputs, as discussed above. When s = |, the FSM changes to state $2, where 
it asserts the enable ES on the Sum register and allows C to decrement. When the counter 
reaches 0) (z = 1), the machine enters state $3, where it asserts the LA and EB signals to 
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Sum = 0; 

fori = k — 1 downto 0 do 
Sum = Sum + R; 

end for; 

M= Sum ~ k; 


(a) Pseudo-code 


Reset 


Sl 


Sum <—0,C¢ k-1 
0 
1 


S2 
0 


1 


S3 
M<€< Sum/k 


(b) ASM chart 


Figure 7.37 — Analgorithm for finding the mean of k numbers. 
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RAdd ER 


B EB A LA 


Divider 


R Q_ Done 


Figure 7.38 — Datapath circuit for the mean operation. 
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Reset 


LC, Ssel = 0, ES 


Ssel = 1, ES 


Figure 7.39 = ASM chart for the mean operation control circuit. 


load the Sum and k into the A and B inputs of the divider circuit, respectively. The FSM then 
enters state $4 and asserts the Div signal to start the division operation. When it is finished, 
the divider circuit sets zz = 1, and the FSM moves to state S5. The mean M appears on 
the Q and R outputs of the divider circuit. The Div signal must still be asserted in state S'5 
to prevent the divider circuit from reinitializing its registers. Note that in the ASM chart in 
Figure 7.37b, only one state is shown for computing M = Sum/k, but in Figure 7.39, states 
S3 and S4 are used for this purpose. It is possible to combine states $3 and $4, which we 
will leave as an exercise for the reader (Problem 7.10). 
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7.7 SORT OPERATION 


Given a list of k unsigned n-bit numbers stored in a set of registers Ro, ..., Ry_1, we wish to 
design a circuit that can sort the list (contents of the registers) in ascending order. Pseudo- 
code for a simple sorting algorithm is shown in Figure 7.40. Itis based on finding the smallest 
number in the sublist R;, ..., Ry; and moving that number into R;, fori = 1,2,...,k — 2. 
Each iteration of the outer loop places the number in R; into A. Each iteration of the inner 
loop compares this number to the contents of another register Rj. If the number in R; is 
smaller than A, the contents of R; and R; are swapped and A is changed to hold the new 
contents of R;. 

An ASM chart that represents the sorting algorithm is shown in Figure 7.41. In the 
initial state $1, while s = 0 the registers are loaded from external data inputs and a counter 
C; that represents i in the outer loop is cleared. When the machine changes to state $2, A is 
loaded with the contents of R;. Also, C;, which represents j in the inner loop, is initialized 
to the value of 7. State S3 is used to initialize j to the value i+ 1, and state $4 loads the 
value of R; into B. In state $5, A and B are compared, and if B < A, the machine moves to 
state $6. States S56 and S7 swap the values of R; and R;. State 58 loads A from R;. Although 
this step is necessary only for the case where B < A, the flow of control is simpler if this 
operation is performed in both cases. If C; is not equal to k — 1, the machine changes from 
S8 to S4, thus remaining in the inner loop. If C; = k — 1 and C; is not equal to k — 2, then 
the machine stays in the outer loop by changing to state $2. 


Datapath Circuit 


There are many ways to implement a datapath circuit that meets the requirements of 
the ASM chart in Figure 7.41. One possibility is illustrated in Figures 7.42 and 7.43. 
Figure 7.42 shows how the registers Ro, ..., Ry—; can be connected to registers A and B 
using 4-to-1 multiplexers. We assume the value k = 4 for simplicity. Registers A and B are 
connected to a comparator subcircuit and, through multiplexers, back to the inputs of the 


fori =O0tok—2do 


A= Ri; 
for j= iCltok—1do 
B=k;; 
if B < A then 
Ri = B; 
Rj =A; 
A= Rj; 
end if; 
end for; 
end for; 


Figure 7.40 _—_ Pseudo-code for the sort operation. 
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Reset 


Load registers 


C;-C;+1 


Figure 7.41 = ASM chart for the sort operation. 
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DatalIn 


Clock 


DataOut 


BItA 


Figure 7.42 — A part of the datapath circuit for the sort operation. 


registers Ro,..., Ry_1. The registers can be loaded with initial (unsorted) data using the 
DataIn lines. The data is written (loaded) into each register by asserting the Wr/nit control 
signal and placing the address of the register on the RAdd input. The tri-state driver driven 
by the Rd control signal is used to output the contents of the registers on the DataOut output. 

The signals Ring, ..., Ring; are controlled by the 2-to-4 decoder shown in Figure 
7.43. If Int = 1, the decoder is driven by one of the counters C; or C;. If Int = 0, then the 
decoder is driven by the external input RAdd. The signals z; and z; are set to 1 if C; = k — 2 
and C; = k — 1, respectively. An ASM chart that shows the control signals used in the 
datapath circuit is given in Figure 7.44. 
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Figure 7.43 — A part of the datapath circuit for the sort operation. 


Verilog Code 

Verilog code for the sorting operation is presented in Figure 7.45. The FSM that 
controls the sort operation is described in the same way as in previous examples, using the 
always blocks State_table, State_ flipflops, and FSM_outputs. Following these blocks, the 
code instantiates the registers Ro to R3, as well as A and B. The counters C; and C; have the 
instance names OuterLoop and InnerLoop, respectively. The multiplexers with the outputs 
CMux and [Mux are specified using the conditional operator. The 4-to-1 multiplexer in 
Figure 7.42 is defined by the case statement that specifies the value of the ABData signal 
for each value of JMux. The 2-to-4 decoder in Figure 7.43 with the outputs Ring, ..., Ring 
is defined by the case statement that sets the value of the concatenated signals {Rin3, Rin2, 
Rin1, RinO}. Finally, the code specifies the values of the z; and z; signals, and defines the 
tri-state drivers for the DataOut output. 
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Reset 


LI, Int =0 


$2 
Int = 1, Csel = 0, Ain, LJ 


S3 
SY) 


S4 
Bin, Csel = 1, Int = 1 


S5 
(1) Csel = 1, Int = 1, Wr, Aout 


Csel = 0, Int = 1, Wr, Bout 


S8 


Csel = 0, Int = 1, Ain 


1 


< 


Figure 7.44 = ASM chart for the control circuit. 
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module sort (Clock, Resetn, s, WrInit, Rd, DataIn, RAdd, DataOut, Done); 
parameter n= 4; 


input Clock, Resetn, s, WrInit, Rd; 
input [n— 


1:0] DataIn; 


input [1:0] RAdd; 
output [n—1:0] DataOut; 
output reg Done; 


wire [1:0] Ci, Cj, CMux, IMux; 
wire [n—1:0] RO, R1, R2, R3, A, B, RData, ABMux; 


wire BItA, zi, zj; 


reg Int, Csel, Wr, Ain, Bin, Bout; 
reg LI, LJ, El, EJ, RinO, Rin1, Rin2, Rin3; 


reg [3:0] y, Y; 
reg [n—1:0] ABData; 


// control circuit 
parameter S1 = 4’b0000, S2 = 4’b0001, S3 = 4’b0010, S4 = 4’b0011; 
parameter S5 = 4’b0100, S6 = 4’b0101, S7 = 4’b0110, S8 = 4’b0111, S9 = 4’b1000; 


always @(s, BItA, zj, zi, y) 
begin: State_table 
case (y) 


SI: 


S9: 


if (s ==0) Y=S1; 
else Y = S2; 


: Y=S3; 
: Y=S4; 
: Y=S5; 
: if (BItA) Y = S6; 


else Y = S8; 


YS S75 
: Y=S8; 
: if (!zj) Y=S4; 


else if (!zi) Y = S2; 
else Y = S9; 

if (s) Y= S9; 

else Y = S1; 


default: Y = 4’bx; 
endcase 


end 


... continued in Part b. 


Figure 7.45 


Verilog code for the sorting circuit (Part a). 
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always @(posedge Clock, negedge Resetn) 
begin: State_flipflops 
(Resetn == 0) 
y<=S1; 
else 
y <=Y; 
end 


always @(y, Zj, zi) 
begin: FSM_outputs 
// defaults 
Int = 1; Done = 0; LI = 0; LJ = 0; EI = 0; EJ = 0; Csel = 0; 
Wr = 0; Ain = 0; Bin = 0; Bout = 0; 
case (y) 
S1: begin LI = 1; Int = 0; end 
S2: begin Ain = 1; LJ = 1; end 
$3: EJ =1; 
S4: begin Bin = 1; Csel = 1; end 
S5:; // no outputs asserted in this state 
S6: begin Csel = 1; Wr = 1; end 
S7: begin Wr = 1; Bout = 1; end 


S8: begin 
Ain = 1; 
if (!zj) EJ = 1; 
else 
begin 
EJ = 0; 
if (!zi) El = 1; 
else EI = 0; 
end 
end 
S9: Done = 1; 
endcase 


end 
... continued in Part c. 


Figure 7.45 _ Verilog code for the sorting circuit (Part b). 


We implemented the code in Figure 7.45 inan FPGA chip. Figure 7.46 gives an example 
of a simulation result. Part (a) of the figure shows the first half of the simulation, from 0 
to 1.25 jus, and part (b) shows the second half, from 1.25 js to 2.5 ws. After resetting the 
circuit, Wrlnit is set to 1 for four clock cycles, and unsorted data is written into the four 
registers using the Dataln and RAdd inputs. After s is changed to 1, the FSM changes to 
state $2. States $2 to S4 load A with the contents of Ro (3) and B with the contents of 
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//datapath circuit 


regne Reg( (RData, Clock, Resetn, RinO, RO); 
defparam Reg0.n = n; 

regne Regl (RData, Clock, Resetn, Rin1, R1); 
defparam Reg1.n =n; 

regne Reg2 (RData, Clock, Resetn, Rin2, R2); 
defparam Reg?2.n = n; 

regne Reg3 (RData, Clock, Resetn, Rin3, R3); 
defparam Reg3.n = n; 


regne RegA (ABData, Clock, Resetn, Ain, A); 
defparam RegA.n =n; 

regne RegB (ABData, Clock, Resetn, Bin, B); 
defparam RegB.n = n; 


assign BItA=(B<A)?1: 0; 
assign ABMux = (Bout == 0)? A: B; 
assign RData = (WriInit == 0) ? ABMux : Dataln; 


upcount OuterLoop (2’b00, Resetn, Clock, EI, LI, Ci); 
upcount InnerLoop (Ci, Resetn, Clock, EJ, LJ, Cj); 


assign CMux = (Csel == 0) ? Ci: Cj; 
assign IMux = (Int == 1) ? CMux : RAdd; 


.. continued in Part d. 


Figure 7.45 Verilog code for the sorting circuit (Part c). 


R, (2). State S5 compares B with A, and since B < A, the FSM uses states $6 and S7 to 
swap the contents of registers Rp and R;. In state $8, A is reloaded from Ro, which now 
contains 2. Since z; is not asserted, the FSM increments the counter C; and changes back 
to state $4. Register B is now loaded with the contents of R2 (4), and the FSM changes to 
state S5. Since B = 4 is not less than A = 2, the machine changes to $8 and then back to 
S4. Register B is now loaded with the contents of R3 (1), which is then compared against 
A = 2 in state S5. The contents of Ro and R3 are swapped, and the machine changes to S8. 
At this point, the register contents are Ro = 1, Ry = 3, Ro = 4, and R3 = 2. Since z = 1 
and z; = 0, the FSM performs the next iteration of the outer loop by changing to state S2. 
Jumping forward in the simulation time, in Figure 7.46b the circuit reaches the state in 
which C; = 2, C; = 3, and the FSM is in state $8. The FSM then changes to state $9 and 
sets Done to the value 1. The correctly sorted data is read out of the registers by setting the 
signal Rd = | and using the RAdd inputs to select each of the registers. 
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always @(Wrinit, Wr, IMux, RO, RI, R2, R3) 


begin 
case (IMux) 
0: ABData = RO; 
1: ABData=RI1; 
2: ABData = R2; 
3: ABData = R3; 
endcase 


if (WrInit || Wr) 
case (IMux) 
0: {Rin3, Rin2, Rinl, RinO} = 4’b0001; 
1: {Rin3, Rin2, Rinl, RinO} = 4’b0010; 
2: {Rin3, Rin2, Rinl, RinO} = 4’b0100; 
3: {Rin3, Rin2, Rinl, Rin0} = 4’b1000; 
endcase 
else {Rin3, Rin2, Rinl, Rin0} = 4’b0000; 
end 


assign zi = (Ci == 2); 
assign § zj = (Cj == 3); 
assign DataOut = (Rd == 0) ? ’bz : ABData; 


endmodule 


Figure 7.45 _ Verilog code for the sorting circuit (Part d). 


| 7.8 CLOCK SYNCHRONIZATION AND TIMING ISSUES 


In previous sections we presented several examples of logic circuits that illustrate how a 
designer may approach the task of designing larger systems. In this section we will examine 
some timing aspects of such systems. 


7.8.1 CLOocK DISTRIBUTION 


In Chapter 5 we discussed the need to deal with the clock skew problem. For proper 
operation of synchronous sequential circuits, it is essential to minimize the clock skew as 
much as possible. Chips that contain many flip-flops, such as PLDs, use carefully designed 
networks of wires to distribute the clock signal to the flip-flops. Figure 7.47 gives an 
example of a clock-distribution network. Each node labeled ff represents the clock input 
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(b) Completing the sort operation and reading the registers 


Figure 7.46 — Simulation results for the sort operation. 
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of a flip-flop; for clarity, the flip-flops are not shown. The buffer on the left of the figure 
produces the clock signal. This signal is distributed to the flip-flops such that the length of 
the wire between each flip-flop and the clock source is the same. Due to the appearance of 
wires, which resemble the letter H, the clock distribution network is known 
In PLDs the term global clock refers to the clock network. A PLD chip 
usually provides one or more global clocks that can be connected to all flip-flops. When 
designing a circuit to be implemented in such a chip, a good design practice is to connect 


sections of the 
as an H tree. 
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Figure 7.47 ~— AnH tree clock distribution network. 


all the flip-flops in the circuit to a single global clock. 


It is useful to be able to ensure that a sequential circuit is reset into a known state when 
power is first applied to the circuit. A good design practice is to connect the asynchronous 
reset (clear) inputs of all flip-flops to a wiring network that provides a low-skew reset signal. 


PLDs usually provide a global reset wiring network for this purpose. 
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7.8.2 Fuiie-FLop TIMING PARAMETERS 


We discussed the timing parameters for storage elements in Chapter 5. Data to be clocked 
into a flip-flop must be stable t,,, before the active clock edge and must remain stable ¢,, after 
the clock edge. A change in the value of the output Q appears after the clock-to-Q delay, 
t-g. An output delay time, toa, is required for the change in Q to propagate to an output pin 
on the chip. These timing parameters account for the behavior of an individual flip-flop 
without considering how the flip-flop is connected to other circuitry in an integrated circuit 
chip. 

Figure 7.48 depicts a flip-flop as part of an integrated circuit. Connections are shown 
from the flip-flop’s clock, D, and Q terminals to pins on the chip package. There is an input 
buffer associated with each pin on the chip. Other circuitry may also be connected to the 
flip-flop; the shaded box represents a combinational circuit connected to D. The propagation 
delays between the pins on the chip package and the flip-flop are labeled in the figure as 
tpata> tClock> and lod: 

In digital systems the output signals from one chip are used as the input signals to 
another chip. Often the flip-flops in all chips are driven by a common clock that has low 
skew. The signals must propagate from the Q outputs of flip-flops in one chip to the D 
inputs of flip-flops in another chip. To ensure that all timing specifications are met, it is 
necessary to consider the output delays in one chip and the input delays in another. 

The f,, delay determines how long it takes from when an active clock edge occurs at 
the clock pin on the chip package until a change in the output of a flip-flop appears at an 
output pin on the chip. This delay consists of three main parts. The clock signal must first 
propagate from its input pin on the chip to the flip-flop’s Clock input. This delay is labeled 
tclock in Figure 7.48. After the clock-to-Q delay f-g, the flip-flop produces a new output, 
which takes f,4 to propagate to the output pin. An example of timing parameters taken from 
a commercial CPLD chip is fcjock = 1.5 ns, fog = 1 ns, and fog = 2 ns. These parameters 
give the delay from the active clock edge to the change on the output pin as f,. = 4.5 ns. 


= lata 
Chip package pin 
Data 


|e Clock Seen | lod 


Figure 7.48 A flip-flop in an integrated circuit chip. 
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Figure 7.49 __Flip-flop timing in a chip. 


If chips are separated by a large distance, the propagation delays between them must 
be taken into consideration. But in most cases the distance between chips is small, and the 
propagation time of signals between the chips is negligible. Once a signal reaches the input 
pin on a chip, the relative values of tpgia and tciock (see Figure 7.48) must be considered. 
For example, in Figure 7.49 we assume that fpgiqg = 4.5 ns and fcigc, = 1.5 ns. The setup 
time for the flip-flops in the chip is specified as ¢,,, = 3 ns. In the figure the Data signal 
changes from low to high 3 ns before the positive clock edge, which should meet the setup 
requirements. The Data signal takes 4.5 ns to reach the flip-flop, whereas the Clock signal 
takes only 1.5 ns. The signal labeled A and the clock signal labeled B reach the flip-flop 
at the same time. The setup time requirement is violated, and the flip-flop may become 
unstable. To avoid this condition, it is necessary to increase the setup time as seen from 
outside the chip. 

The hold time for flip-flops is also affected by chip-level delays. The result is usually a 
reduction in the hold time, rather than an increase. For example, with the timing parameters 
in Figure 7.49 assume that the hold time is t, = 2 ns. Assume that the signal at the Data pin 
on the chip changes value at exactly the same time that an active edge occurs at the Clock 
pin. The change in the Clock signal will reach node B 4.5 — 1.5 = 3 ns before the change 
in Data reaches node A. Hence even though the external change in Data is coincident with 
the clock edge, the required hold time of 2 ns is not violated. 

For large circuits, ensuring that flip-flop timing parameters are properly adhered to is 
a challenge. Both the timing parameters of the flip-flops themselves and the relative delays 
incurred by the clock and data signals must be considered. CAD systems provide tools that 
can check the setup and hold times at all flip-flops automatically. This task is done using 
timing simulation, as well as special-purpose timing-analysis tools. 


7.8.3 ASYNCHRONOUS INPUTS TO FLIP-FLOPS 


In our examples of synchronous sequential circuits, we have assumed that changes in all 
input signals occur shortly after an active clock edge. The rationale for this assumption is 
that the inputs to one circuit are produced as the outputs of another circuit, and the same 
clock signal is used for both circuits. In practice, some of the inputs to a circuit may be 
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Figure 7.50 — Asynchronous inputs. 


generated asynchronously with respect to the clock signal. If these signals are connected 
to the D input of a flip-flop, then the setup or hold times may be violated. 

When a flip-flop’s setup or hold times are violated, the flip-flop’s output may assume a 
voltage level that does not correspond to either logic value 0 or 1. We say that the flip-flop is 
ina metastable state. The flip-flop eventually settles in one of the stable states, 0 or 1, but the 
time required to recover from the metastable state is not predictable. A common approach 
for dealing with asynchronous inputs is illustrated in Figure 7.50. The asynchronous data 
input is connected to a two-bit shift register. The output of the first flip-flop, labeled A in 
the figure, will sometimes become metastable. But if the clock period is sufficiently long, 
then A will recover to a stable logic value before the next clock pulse occurs. Hence the 
output of the second flip-flop will not become metastable and can safely be connected to 
other parts of the circuit. The synchronization circuit introduces a delay of one clock cycle 
before the signal can be used by the rest of the circuit. 

Commercial chips, such as PLDs, specify the minimum allowable clock period that has 
to be used for the circuit in Figure 7.50 to solve the metastability problem. In practice, it is 
not possible to guarantee that node A will always be stable before a clock edge occurs. The 
data sheets specify a probability of node A being stable, as a function of the clock period. 
We will not pursue this issue further; the interested reader can refer to references [10, 11] 
for a more detailed discussion. 


7.8.4 SwitcH DEBOUNCING 


Inputs to a logic circuit are sometimes generated by mechanical switches. A problem with 
such switches is that they bounce away from their contact points when changed from one 
position to the other. Figure 7.51a shows a single-pole single-throw switch that provides 
an input to a logic circuit. If the switch is open, then the Data signal has the value 1. When 
the switch is thrown to the closed position, Data becomes 0, but the switch bounces for 
some time, causing Data to oscillate between | and 0. The bouncing typically persists for 
about 10 ms. 

One approach to deal with this problem is to add a capacitor, C, between the Data node 
and ground. When the switch is open, C is charged to the voltage Vpp. When the switch is 
thrown to the closed position, C is quickly discharged to 0 V (ground) through the switch 
which has a resistance close to zero ohm. Now, when the switch contact bounces away 
from its closed position, C will start to charge toward Vpp again. But, this charging will be 
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(a) Single-pole single-throw switch 


Vpp 


Data 


Vpp 


(b) Single-pole double-throw switch with a basic SR latch 


Figure 7.51 = Switch debouncing circuit. 


slower due to the resistor R. By choosing the right sizes of R and C, it is possible to prevent 
the Data node from significantly changing its voltage during bouncing. 

Another possible solution is to use a circuit, such as a counter, to measure an appropri- 
ately long delay to wait for the bouncing to stop (see Problem 7.20). 

A good approach for dealing with switch bouncing is depicted in Figure 7.51b. It uses a 
single-pole double-throw switch and a basic SR latch to generate an input to a logic circuit. 


PROBLEMS 


When the switch is in the bottom position, the R input on the latch is 0 and Data = 0. When 
the switch is thrown to the top position, the S input on the latch becomes 0, which sets 
Data to 1. If the switch bounces away from the top position, the inputs to the latch become 
R=S = 1 and the value Data = 1 is stored by the latch. When the switch is thrown to 
the bottom position, Data changes to 0 and this value is stored in the latch if the switch 
bounces. Note that when a switch bounces, it cannot bounce fully between the S and R 
terminals; it only bounces slightly away from one of the terminals and then back to it. 
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7.9 CONCLUDING REMARKS 


This chapter has provided several examples of digital systems that include one or more 
FSMs as well as building blocks like adders, registers, shift registers, and counters. We have 
shown how ASM charts can be used as an aid for designing a digital system, and we have 
shown how the circuits can be described using Verilog code. A number of practical issues 
have been discussed, such as clock distribution, synchronization of asynchronous inputs, 
and switch debouncing. Some notable books that also deal with the material presented in 
this chapter include [3-10]. 


PROBLEMS 


7.1 InSection 7.1 we showed a digital system with three registers, R1 to R3, and we designed a 
control circuit that can be used to swap the contents of registers Rl and R2. Give an ASM 


chart that represents this digital system and the swap operation. 


7.2 (a) For the ASM chart derived in Problem 7.1, show another ASM chart that specifies the 
required control signals to control the datapath circuit. Assume that multiplexers are used 


to implement the bus that connects the registers, as shown in Figure 7.4. 


(b) Write complete Verilog code for the system in Problem 7.1, including the control circuit 


described in part (a). 


(c) Synthesize a circuit from the Verilog code written in part (b) and show a timing simulation 


that illustrates correct functionality of the circuit. 


7.3 In Section 7.2 we designed a processor that performs the operations listed in Table 7.1. 
Design a modified circuit that performs an additional operation Swap Rx, Ry. This operation 
swaps the contents of registers Rx and Ry. Use three bits f> f| fo to represent the input F 
shown in Figure 7.11 because there are now five operations, rather than four. Add a new 
register, named Tmp, into the system, to be used for temporary storage during the swap 
operation. Show logic expressions for the outputs of the control circuit, as was done in 


Section 7.2. 


7.4 In Section 7.2 we gave the design for a circuit that works as a processor. Give an ASM 


chart that describes the functionality of this processor. 
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(a) For the ASM chart derived in Problem 7.4, show another ASM chart that specifies 
the required control signals to control the datapath circuit in the processor. Assume that 
multiplexers are used to implement the bus that connects the registers, RO to R3, in the 
processor. 

(b) Write complete Verilog code for the system in Problem 7.4, including the control circuit 
described in part (a). 

(c) Synthesize a circuit from the Verilog code written in part (b) and show a timing simulation 
that illustrates correct functionality of the circuit. 


The ASM chart in Figure 7.17, which describes the bit-counting circuit, includes Moore-type 
outputs in states $1, $2, and $3, and it has a Mealy-type output in state $2. 

(a) Show how the ASM chart can be modified such that it has only Moore-type outputs in 
state S'2. 

(b) Give the ASM chart for the control circuit corresponding to part (a). 

(c) Give Verilog code that represents the modified control circuit. 


Figure 7.24 shows the datapath circuit for the shift-and-add multiplier. It uses a shift register 
for B so that bo can be used to decide whether or not A should be added to P. A different 
approach is to use a normal register to hold operand B and to use a counter and multiplexer 
to select bit b; in each stage of the multiplication operation. 

(a) Show the ASM chart that uses a normal register for B, instead of a shift register. 

(b) Show the datapath circuit corresponding to part (a). 

(c) Give the ASM chart for the control circuit corresponding to part (b). 

(d) Give Verilog code that represents the multiplier circuit. 


Write Verilog code for the divider circuit that has the datapath in Figure 7.30 and the control 
circuit represented by the ASM chart in Figure 7.31. 


Section 7.5 shows how to implement the traditional long division that is done by “hand.” 
A different approach for implementing integer division is to perform repeated subtraction 
as indicated in the pseudo-code in Figure P7.1. 

(a) Give an ASM chart that represents the pseudo-code in Figure P7.1. 

(b) Show the datapath circuit corresponding to part (a). 

(c) Give the ASM chart for the control circuit corresponding to part (b). 

(d) Give Verilog code that represents the divider circuit. 

(e) Discuss the relative merits and drawbacks of your circuit in comparison with the circuit 
designed in Section 7.5. 


Q=0; 

R=A; 

while ((R — B) > 0) do 
R=R-B; 
Q=Q+1; 

end while; 


Figure P7.1 Pseudo-code for integer division. 


7.10 


7.11 
7.12 


7.13 


7.14 


7.15 


7.16 


7.17 


7.18 
7.19 


PROBLEMS 487 


In the ASM chart in Figure 7.39, the two states $3 and $4 are used to compute the mean 
M = Sum/k. Show a modified ASM chart that combines states S3 and $4 into a single 
state, called $3. 


Write Verilog code for the FSM represented by your ASM chart defined in Problem 7.10. 


In the ASM chart in Figure 7.41, we specify the assignment C; <— C; in state $2, and then 
in state S3 we increment C; by 1. Is it possible to eliminate state $3 if the assignment 
C; <— C; + 1 is performed in $2? Explain any implications that this change has on the 
control and datapath circuits. 


Figure 7.40 gives pseudo-code for the sorting operation in which the registers being sorted 
are indexed using variables i and j. In the ASM chart in Figure 7.41, variables 7 and j are 
implemented using the counters C; and C;. A different approach is to implement i and j 
using two shift registers. 

(a) Redesign the circuit for the sorting operation using the shift registers instead of the 
counters to index registers Ro, ..., Rs. 

(b) Give Verilog code for the circuit designed in part (a). 

(c) Discuss the relative merits and drawbacks of your circuit in comparison with the circuit 
that uses the counters C; and Cj. 


Design a circuit that finds the /ogs of an operand that is stored in an n-bit register. Show 
all steps in the design process and state any assumptions made. Give Verilog code that 
describes your circuit. 


The circuit designed in Section 7.6 uses an adder to compute the sum of the contents of 
the registers. The divider subcircuit used to compute M = Sum/k also includes an adder. 
Show how the circuit can be redesigned so that it contains only a single adder subcircuit 
that is used both for the summation operation and the division operation. Show only the 
extra circuitry needed to connect to the adder; and explain its operation. 


Give Verilog code for the circuit designed in Problem 7.15, including both the datapath and 
control circuits. 


The pseudo-code for the sorting operation given in Figure 7.40 uses registers A and B to 
hold the contents of the registers being sorted. Show pseudo-code for the sorting operation 
that uses only register A to hold temporary data during the sorting operation. Give a 
corresponding ASM chart that represents the datapath and control circuits needed. Use 
multiplexers to interconnect the registers, in the style shown in Figure 7.42. Give a separate 
ASM chart that represents the control circuit. 


Give Verilog code for the sorting circuit designed in Problem 7.17. 


Consider the design of a circuit that controls the traffic lights at the intersection of two roads. 
The circuit generates the outputs G1, Y1, Rl and G2, Y2, R2. These outputs represent the 
states of the green, yellow, and red lights, respectively, on each road. A light is turned on 
if the corresponding output signal has the value 1. The lights have to be controlled in the 
following manner: when G1 is turned on it must remain on for a time period called tf, and 
then be turned off. Turning off G1 must result in Y 1 being immediately turned on; it should 
remain on for a time period called ft) and then be turned off. When either G1 or Y 1 is on, R2 
must be on and G2 and Y2 must be off. Turning off Y 1 must result in G2 being immediately 
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turned on for the ¢, time period. When G2 is turned off, Y2 is turned on for the f¢) time 
period. Of course, when either G2 or Y2 is turned on, R1 must be turned on and G1 and 
Y1 must be off. 

(a) Give an ASM chart that describes the traffic-light controller. Assume that two down- 
counters exist, one that is used to measure the t; delay and another that is used to measure 
ty. Each counter has parallel-load and enable inputs. These inputs are used to load an 
appropriate value representing either the ¢, or fy delay and then allow the counter to count 
down to 0. 

(b) Give an ASM chart for the control circuit for the traffic-light controller. 

(c) Write complete Verilog code for the traffic-light controller, including the control circuit 
from part (a) and counters to represent ¢; and f2. Use any convenient clock frequency to 
clock the circuit and assume convenient count values to represent f; and f2. Give simulation 
results that illustrate the operation of your circuit. 


Assume that you need to use a single-pole single-throw switch as shown in Figure 7.5 1a. 
Show how a counter can be used as a means of debouncing the Data signal produced by the 
switch. (Hint: Design an FSM that has Data as an input and produces the output z, which 
is the debounced version of Data. Assume that you have access to a Clock input signal with 
the frequency 102.4 kHz, which can be used as needed.) 


Clock signals can be generated using special purpose chips. One example of such a chip 
is the 555 programmable timer, which is depicted in Figure P7.2. By choosing particular 
values for the resistors R, and R, and the capacitor C), the 555 timer can be used to produce 


Clock 
(output) 


Figure P7.2 = The 555 programmable timer chip. 
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a desired clock signal. It is possible to choose both the period of the clock signal and its 
duty cycle. The term duty cycle refers to the percentage of the clock period for which the 
signal is high. The following equations define the clock signal produced by the chip 


Clock period = 0.7(Rg + 2Rp)Ci 
Ra - Ry 
Ra + 2Rp 


(a) Determine the values of R,, R,, and C; needed to produce a clock signal with a 50 
percent duty cycle and a frequency of about 500 kHz. 
(b) Repeat part (a) for a duty cycle of 75 percent. 


Duty cycle = 
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OPTIMIZED IMPLEMENTATION OF LOGIC 
FUNCTIONS 


CHAPTER OBJECTIVES 


In this chapter you will learn about: 

e Synthesis of logic functions 

e Analysis of logic circuits 

e Algorithmic techniques for the optimization of logic functions 

e Cubical representation of logic functions 

e Representation of logic functions using binary decision diagrams 
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In Chapter 2 we showed how to find the lowest-cost implementations of logic functions by using either 
algebraic manipulation or Karnaugh maps. The reader is probably convinced that it is not at all obvious 
how to apply the theorems of Boolean algebra to find a minimum-cost circuit. Karnaugh maps provide a 
systematic way of manually deriving minimum-cost implementations of simple logic functions, but they 
become impractical for functions of many variables. 

If CAD tools are used to design logic circuits, the task of minimizing the cost of implementation does not 
fall to the designer; the tools perform the necessary optimizations automatically. Even so, it is important to 
know something about this process. Most CAD tools have many features and options that are under control 
of the user. To know when and how to apply these options, the user must have an understanding of what the 
tools do. 

In this chapter we will introduce some of the optimization techniques implemented in CAD tools and 
show how these techniques can be automated. We first discuss synthesis techniques that can be used to produce 
multilevel logic circuits, rather than just the two-level sum-of-products and product-of-sums forms that we 
studied in Chapter 2. Then we will describe a cubical representation of logic functions, as well as a graphical 
representation called binary decision diagrams (BDDs). We will discuss some algorithmic approaches for 
circuit optimization that are based on these representations. 


| 8.1 MULTILEVEL SYNTHESIS 


In the previous chapters we have usually implemented logic functions in the sum-of- 
products or product-of-sums form. Logic circuits of this type have two levels of gates. 
In the sum-of-products form, the first level comprises AND gates that are connected to 
a second-level OR gate. In the product-of-sums form, the first-level OR gates feed the 
second-level AND gate. We have assumed that both true and complemented versions of the 
input variables are available so that NOT gates are not needed to complement the variables. 

A two-level realization is usually efficient for functions of a few variables. However, 
as the number of inputs increases, a two-level circuit may result in logic gates that have too 
many inputs. Whether or not this is an issue depends on the type of technology that is used 
to implement the circuit. For example, consider the following function: 


Sf (X14 66 X7) = X1XZXE A AX4XsXe + XQNZX7 + XQN4XSX7 


This is a minimum-cost SOP expression. Now consider implementing f in two types of 
chips: a CPLD and an FPGA. Figure 8.1 shows a logic element in a typical CPLD, like 
those shown in Appendix B. The figure indicates in blue the circuitry used to realize the 
function f. The CPLD architecture is well-suited for SOP implementations. 

Next, consider implementing f in an FPGA. For this example we will assume an FPGA 
in which the logic elements are four-input lookup tables (LUTs). As discussed in Appendix 
B, a four-input LUT can implement any logic function that has up to four inputs. Since the 
SOP expression for f requires three- and four-input AND operations and a four-input OR, 
each of these logic operations could be realized in a separate LUT. Four LUTs would be 
needed for the AND operations, plus one additional LUT for the OR operation, giving five 
LUTs in total. 
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(from interconnection wires) 


X; XX, %3 Xg Xs Xe Xz unused 


Part of a logic element 


Figure 8.1 Implementation in a CPLD. 


A more efficient implementation in the FPGA architecture can be obtained if f is 
expressed in a form that has more than two levels of logic operations. Such a form is 
called a multilevel logic expression. There are several different approaches for synthesis 
of multilevel circuits. We will discuss two important techniques known as factoring and 
functional decomposition. 


8.1.1 FACTORING 


The distributive property of Boolean algebra allows us to factor the preceding expression 
for f as follows 


f = X1X6(x3 + X4x5) + X2X7(X3 + X4X5) 
= (%1%65 + X2X7) (13 + X4X5) 


The corresponding circuit can be realized using only two LUTs, as shown in Figure 8.2. The 
first LUT implements the expression g = (x|X6 + x2x7), and the second LUT implements 
f = 8 + (3 + x4X5). 

Fan-in Problem 

In the preceding example, the number of inputs, or fan-in, of each LUT is restricted 
to four because of the FPGA architecture. But even if we were to make a custom chip to 
implement a circuit it would still be necessary to limit the fan-in of gates. This limit is 
caused by practical issues related to transistor technology, which is discussed in Appendix 
B. Suppose that the available gates in a custom chip have a maximum fan-in of four. Then, 
if a logic expression includes a seven-input product term, we would have to use 2 four-input 
AND gates, as indicated in Figure 8.3. 
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Figure 8.2 — Using three LUTs after factoring. 


7 inputs 


Figure 8.3 Using four-input AND gates to realize a seven-input 
product term. 


Factoring can be used to deal with the fan-in problem. As an example, consider the 
function 


Sf = Xp X2X3X4XsXG A Xj XQN3X4X5XG 


This is a minimal sum-of-products expression. Using the approach of Figure 8.3, we will 
need four AND gates and one OR gate to implement this expression. A better solution is to 
factor the expression as follows 


SF = 1 X4X6 (X2X3X5 + X2X3X5) 


Then three AND gates and one OR gate suffice for realization of the required function, as 
shown in Figure 8.4. 


Example 8.1 


In practical situations a designer of logic circuits often encounters specifications that natu- 
rally lead to an initial design where the logic expressions are in a factored form. Suppose 
we need a circuit that meets the following requirements. There are four inputs: x1, x2, x3, 
and x4. An output, fj, must have the value | if at least one of the inputs x; and x2 is equal 
to 1 and both x3 and x4 are equal to 1; it must also be | if x; = x2 = 0 and either x3 or x4 
is 1. In all other cases f; = 0. A different output, fo, is to be equal to | in all cases except 
when both x, and x2 are equal to 0 or when both x3 and x4 are equal to 0. 
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Figure 8.4 A factored circuit. 


From this specification, the function f; can be expressed as 
fi = Orr + X2)x3x4 + X1%2 (03 + X4) 
This expression can be simplified to 
Si = 3x4 + X1X2 (x3 + X4) 


which the reader can verify by using a Karnaugh map. 
The second function, fs, is most easily defined in terms of its complement, such that 


fy = HiX2 + X34 
Then using DeMorgan’s theorem gives 
fo = (%1 + X2)(%3 + x4) 


which is the minimum-cost expression for f); the cost increases significantly if the SOP 
form is used. 

Because our objective is to design the lowest-cost combined circuit that implements f; 
and f>, it seems that the best result can be achieved if we use the factored forms for both 
functions, in which case the sum term (x3 + x4) can be shared. Moreover, observing that 
X 1X2. = X, + Xo, the sum term (x, + x2) can also be shared if we express f, in the form 


Si = 3x4 + xy + X2(x3 + X4) 


Then the combined circuit, shown in Figure 8.5, comprises three OR gates, three AND 
gates, one NOT gate, and 13 inputs, for a total cost of 20. 
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nal 
xX 

fi 
x3 iD) 
x4 


Figure 8.5 = Circuit for Example 8.1. 


8.1.2 FUNCTIONAL DECOMPOSITION 


In the preceding examples, which illustrated the factoring approach, multilevel circuits 
were used to deal with fan-in limitations. However, such circuits may be preferable to 
their two-level equivalents even if fan-in is not a problem. In some cases the multilevel 
circuits may reduce the cost of implementation. On the other hand, they usually imply 
longer propagation delays, because they use multiple stages of logic gates. We will explore 
these issues by means of illustrative examples. 

Complexity of a logic circuit can often be reduced by decomposing a two-level circuit 
into subcircuits, where one or more subcircuits implement functions that may be used in 
several places to construct the final circuit. To achieve this objective, a two-level logic 
expression is replaced by two or more new expressions, which are then combined to define 
a multilevel circuit. We can illustrate this idea by a simple example. 


Example 8.2 


Consider the four-variable function f shown in the Karnaugh map in Figure 8.6a. The map 
indicates how the f can be realized in minimum SOP form, leading to the expression 


Sf = Xp X3xX4 + Xp X3X4 + XQHZX4 + X2H3X4 


This expression has seven gates and 18 inputs to gates, for a total cost of 25. The reader 
should observe that in this case we have assumed that inputs are available only in their true 
form, and have therefore included the cost of NOT gates needed to complement x3 and x4. 

To perform functional decomposition we need to find one or more subfunctions that 
depend on only a subset of the input variables. In this example we will attempt to find a 
subfunction that depends only on the inputs x3 and x4. Factoring x; from the first two terms 
and x from the last two terms, we have 


Sf = XH 3x4 + 3X4) + X2(K3X4 + ¥3X4) 
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X4X3X4 
X1X3X4 


X4X3X4 


Xj X3X4 


(a) Product terms (b) Multilevel circuit 


Figure 8.6 The function for Example 8.2. 


Now let g(%3, x4) = 3X4 + x3xX4 and observe that g = X3x4 + x3x4. The decomposed func- 
tion f can be written as 


f= 11g + 228 


Observing that g = x3 © x4 leads to the circuit shown in Figure 8.6b. It has five logic gates 
plus nine inputs to gates, leading to a total cost of 13. The cost of this multilevel circuit is 
significantly lower than that of the SOP implementation of f. The trade-off is an increased 
propagation delay because the circuit has an extra level of logic. 

Instead of using algebraic manipulation to find subfunctions, a more systematic ap- 
proach can be followed by examining the Karnaugh map. Figure 8.7a highlights subfunc- 
tion g in blue. This subfunction depends only on the variables that define the rows of the 
Karnaugh map, namely x3 and x4. We have 


& = X3X4 + X3X4 
= x3 B x4 


Since g is in the rightmost column of the map, where x; = 1 and x2 = 0, the part of f 
defined by g is given by (x;xX2) - g. Now, we can observe that the second column in the 
Karnaugh map represents the subfunction g, and the third column is a subfunction equal to 
1. Thus, f can be implemented using a three-variable function h(x, x2, g) such that 
f O1, X2, 43, X4) = Alx1, x2, 8(%3, x4)] 
= 1X2) - 8 + (11%X2) +B + Cix2) +1 

The structure of this decomposition is illustrated in Figure 8.7b. We can optimize the 
expression for h as shown in Figure 8.7c. The result is 


h=x1g+xg 


which is the same decomposition that we derived above by using algebraic manipulation. 
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X3 — 
” 8 
xX, —> 
+) — 


Xy —— 


(b) The structure of decomposition 


(c) Karnaugh map for A(x, %, g) 


Figure 8.7 — Subfunctions used in decomposition. 


In Figure 8.7a we identified subfunctions by first looking at the columns of the map. 
The same decomposition could also have been derived by first examining the rows of the 
map, which is the approach that we explore in the following example. 


Example 8.3 


Figure 8.8a defines a five-variable function f in the form of a Karnaugh map. Examining 
the rows of the map, we can see that there are only two distinct patterns, which means 
that only two subfunctions are needed to implement the rows. The second and fourth rows 
have one pattern, highlighted in blue, while the first and third rows have the other pattern. 
Consider the second row. It depends only on the variables that define columns in each row, 
namely, x1, x2, and x5. Let a subfunction g(x), x2, x5) represent the pattern in this row. This 
subfunction is just 


& =X +X2+X5 


because the pattern has a | wherever any of these variables is equal to 1. To specify the 
location of rows where the pattern g occurs, which is in rows 2 and 4, we use the variables 
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x5 = 0 x5 = 1 


(a) Karnaugh map for the function f 


x 


X5 


43 
X4 k 
(b) Circuit obtained using decomposition 


Figure 8.8 — Decomposition for Example 8.3. 


x3 and x4. The terms 3x4 and x3x4 identify the second and fourth rows, respectively. Thus 
the expression (*3x4 + x3X4) - g represents the part of f that is defined in rows 2 and 4. 

Next, we have to find a realization for the pattern in rows | and 3. This pattern has 
a 1 only in the cell where x; = x2 = x5 = 0, which corresponds to the term x)X2x5. But 
we can make a useful observation that this term is just a complement of g. The location 
of rows | and 3 is identified by terms 3x4 and x3x4, respectively. Thus the expression 
(x3X4 + x3x4) - g represents f in rows | and 3. 

We can make one other useful observation. The expressions (%3x4 + x3X4) and (%3X%4 + 
x3X4) are complements of each other. Therefore, if we let k(x3, x4) = X3x4 + x3X4, the 
complete decomposition of f can be stated as 


S (1, X2, X3, X4, X5) = hg (1, x2, X5), K(x3, X4)] 
=kget+kg=k@g 
where g =X +42 +5 


k= X3X4 + x3X4 = X3 @ X4 
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The resulting circuit is given in Figure 8.8). It requires a total of three gates and seven 
inputs to gates, for a total cost of 10. The largest fan-in of any gate is three. 
For comparison, a minimum-cost sum-of-products expression for f is 


f = 1X34 + X1x3X4 A XQNZX4 A XYNZX4q + AZX4Xs + NZX4XS + N|NQX3X4Xs + Hy XNZxX4X5 


The corresponding circuit requires a total of 14 gates (including the five NOT gates to 
complement the primary inputs) and 41 inputs, giving a total cost of 55. The fan-in for the 
output OR gate is eight. Obviously, functional decomposition results in a much simpler 
implementation of this function. 


In the two preceding examples, the decomposition is such that a decomposed subfunc- 
tion depends on some primary input variables, whereas the remainder of the implementation 
depends on the rest of the variables. Such decompositions are called disjoint decomposi- 
tions in the technical literature. It is possible to have a non-disjoint decomposition, where 
the variables of the subfunction are also used in realizing the remainder of the circuit. The 
following example illustrates this possibility. 


Example 8.4 


Suppose that we wish to implement the XOR function using only NAND gates. Figure 8.9a 
shows a SOP implementation of this function, and part (b) of the figure gives a NAND- 
NAND circuit that has the same structure. 

Let us now try to exploit functional decomposition to find a better implementation of 
XOR using only NAND gates. Let the symbol + represent the NAND operation so that 
xX, t X2 =X, -X2. Asum-of-products expression for the XOR function is 


XxX) BX2 = XyX2 + X1XQ 


From the discussion in Section 2.7, this expression can be written in terms of NAND 
operations as 


X} BX. = (x) F X2) F(X F x2) 


This expression requires five NAND gates, and it is implemented by the circuit in Figure 
8.9b. Observe that an inverter is implemented using a two-input NAND gate by tying the 
two inputs together. 

To find a decomposition, we can manipulate the term (x; + X2) as follows: 


(x1 f X2) = (1X2) = 0 1 + X2)) = G1 1 + X2)) 


We can perform a similar manipulation for (x; + x2) to generate 


XxX, Bx. = (1 F (1 + %X2)) F (1 + X2) F x2) 


DeMorgan’s theorem states that x; + X2 = x; t x2; hence we can write 


x1 Bx. = i tr F x2)) TCG F 2) F 2) 
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mal 
Xx; OX, 
x2 
(a) Sum-of-products implementation 
x 
xX; OX, 
x2 


(b) NAND gate implementation 


(c) Optimal NAND gate implementation 


Figure 8.9 Implementation of XOR. 


Now we have a decomposition 
X1 Ox. = X1 ft 8) f(g F x2) 
HX tx 


The corresponding circuit, which requires only four NAND gates, is given in Figure 8.9c. 
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Practical Issues 

Functional decomposition is a powerful technique for reducing the complexity of cir- 
cuits. A big problem in functional decomposition is finding the possible subfunctions. For 
functions of many variables, an enormous number of possibilities should be tried. This 
situation precludes attempts at finding optimal solutions. Instead, heuristic approaches that 
lead to acceptable solutions are used. 

Full discussion of functional decomposition and factoring is beyond the scope of this 
book. An interested reader may consult other references [1-4]. Modern CAD tools use the 
concept of decomposition extensively. 


8.1.3. MULTILEVEL NAND AnD NOR CIRcuITS 


In Section 2.7 we showed that two-level circuits consisting of AND and OR gates can 
be easily converted into circuits that can be realized with NAND and NOR gates, using 
the same gate arrangement. In particular, an AND-OR (sum-of-products) circuit can be 
realized as a NAND-NAND circuit, while an OR-AND (product-of-sums) circuit becomes 
a NOR-NOR circuit. The same conversion approach can be used for multilevel circuits. 
We will illustrate this approach by an example. 


Example 8.5 


Figure 8.10a gives a four-level circuit consisting of AND and OR gates. Let us first derive 
a functionally equivalent circuit that comprises only NAND gates. Each AND gate is 
converted to a NAND by inverting its output. Each OR gate is converted to a NAND by 
inverting its inputs. This is just an application of DeMorgan’s theorem, as illustrated in 
Figure 2.26. Figure 8.10b shows the necessary inversions in blue. Note that an inversion is 
applied at both ends of a given wire. Now each gate becomes a NAND gate. This accounts 
for most of the inversions added to the original circuit. But, there are still four inversions 
that are not a part of any gate; therefore, they must be implemented separately. These 
inversions are at inputs x), x5, x6, and x7 and at the output f. They can be implemented as 
two-input NAND gates, where the inputs are tied together. The resulting circuit is shown 
in Figure 8.10c. 

A similar approach can be used to convert the circuit in Figure 8.10qa into a circuit that 
comprises only NOR gates. An OR gate is converted to a NOR gate by inverting its output. 
An AND becomes a NOR if its inputs are inverted, as indicated in Figure 2.26. Using this 
approach, the inversions needed for our sample circuit are shown in blue in Figure 8.1 1a. 
Then each gate becomes a NOR gate. The three inversions at inputs x2, x3, and x4 can be 
realized as two-input NOR gates, where the inputs are tied together. The resulting circuit 
is presented in Figure 8.11. 

It is evident that the basic topology of a circuit does not change substantially when 
converting from AND and OR gates to either NAND or NOR gates. However, it may be 
necessary to insert additional gates to serve as NOT gates that implement inversions not 
absorbed as a part of other gates in the circuit. 
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X7 


(a) Circuit with AND and OR gates 


Xq 


(b) Inversions needed to convert to NANDs 


(c) NAND-gate circuit 


Figure 8.10 = Conversion to a NAND-gate circuit. 
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x) 
X5 
X3 
ey y 
X5 
X6 
Ay 
(a) Inversions needed to convert to NORs 
al 
x 
x3 
f 
x4 
X5 
X6 
X7 


(b) NOR-gate circuit 


Figure 8.11 = Conversion to a NOR-gate circuit. 


| 8.2 ANALYSIS OF MULTILEVEL CIRCUITS 


The preceding section showed that it may be advantageous to implement logic functions 
using multilevel circuits. It also presented the most commonly used approaches for syn- 
thesizing functions in this way. In this section we will consider the task of analyzing an 
existing circuit to determine the function that it implements. 

For two-level circuits the analysis process is simple. If a circuit has an AND-OR 
(NAND-NAND) structure, then its output function can be written in the SOP form by 
inspection. Similarly, it is easy to derive a POS expression for an OR-AND (NOR-NOR) 


8.2 ANALYSIS OF MULTILEVEL CIRCUITS 


circuit. The analysis task is more complicated for multilevel circuits because it is difficult to 
write an expression for the function by inspection. We have to derive the desired expression 
by tracing the circuit and determining its functionality. The tracing can be done either 
starting from the input side and working towards the output, or by starting at the output side 
and working back towards the inputs. At intermediate points in the circuit, it is necessary 
to evaluate the subfunctions realized by the logic gates. 
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Figure 8.12 replicates the circuit from Figure 8.10a. To determine the function f imple- 
mented by this circuit, we can consider the functionality at internal points that are the outputs 
of various gates. These points are labeled P; to Ps in the figure. The functions realized at 
these points are 

P 1 = XX3 

Py = x5 + X6 

P3 = x1 + Py =X + x2%3 

P4 = x4P2 = X4(x5 + X6) 

Ps = Pat x7 = X4(X5 + X6) + X7 


Then f can be evaluated as 
f = P3Ps 
= (x1 + X2X3)(X4(%5 + X6) + X7) 
Applying the distributive property to eliminate the parentheses gives 
f= xyx4xs + XY X4X6 + AX] A AAzA4XS + XDAZA4XG + XIH3AT 


Note that the expression represents a circuit comprising six AND gates, one OR gate, and 25 
inputs. The cost of this two-level circuit is higher than the cost of the circuit in Figure 8.12, 
but the circuit has lower propagation delay. 


Figure 8.12 — Circuit for Example 8.6. 


Example 8.6 
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Example 8.7 


xy 
a) 


x5 


x3 


X4 


In Example 8.3 we derived the circuit in Figure 8.8b. This circuit is reproduced in Fig- 
ure 8.13, except that we have replaced the XOR and XNOR gates from Figure 8.8b with 
their equivalent SOP form. The internal points in Figure 8.13 are labeled from P; to Pig as 
shown. The following subfunctions occur 


P, =X, +X. + X5 


Py =Xq4 

P; =; 

P4 = x3Po 

P5 = x4P3 

Po = Pat Ps 
Py =P 
Pua Ps 

Py = PiPs 
Pio = P7Pg 


We can derive f by tracing the circuit from the output towards the inputs as follows 


f =P 9+ Pio 
= P\Po + P7Ps 


= (x) +x) +.x5)(P4 + Ps) + PiPo 


Ps 


Figure 8.13 — Circuit for Example 8.7. 
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= (xy + x2 +5) (03P2 + .x4P3) + X1X2%5P4P5 


= (xy +X + x5) (x3%4 + X4X3) + Xp HrXs (X3 + Pr) (Hq + Ps) 


= (+ x2 + .X5)(03%4 + 3x4) + X1X2X5 (3 + x4) 4 + 23) 


= XpXZXq + X]X3ZX4 + XQNZX4 + NQNZX4 A NSXZAX4  NSXZX4 


HX X2X5X3X4 +X XQX5X4X3 


This is the same expression as stated in Example 8.3. 
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Circuits based on NAND and NOR gates are slightly more difficult to analyze because each 
gate involves an inversion. Figure 8.14a depicts a simple NAND-gate circuit that illustrates 
the effect of inversions. We can convert this circuit into a circuit with AND and OR gates 
using the reverse of the approach described in Example 8.5. Bubbles that denote inversions 
can be moved, according to DeMorgan’s theorem, as indicated in Figure 8.14b. Then the 
circuit can be converted into the circuit in part (c) of the figure, which consists of AND and 
OR gates. Observe that in the converted circuit, the inputs x3 and x5 are complemented. 
From this circuit the function f is determined as 


f = 1x2 + %3)xX4 + X5 


= XX2xX4 + X3X4 + X5 


It is not necessary to convert a NAND circuit into a circuit with AND and OR gates to 
determine its functionality. We can use the approach from Examples 8.6 and 8.7 to derive 
f as follows. Let P;, P2, and P3 label the internal points as shown in Figure 8.14a. Then 


= P3+Xs 


eS 

| 
Foe ies 
wl on] re 
S| EI 


= Pox4+X%5 = Poxq + Xs 

= Pyx3x4 + X5 = (P| +X3)x4 + Xs 
= (Xpx2 + X3)x4 + Xs 

= (x1X2 + X3)x4 + X5 


= Xj XQX4 + 3X4 + X5 


Example 8.8 
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x) 


Xo 


(a) NAND-gate circuit 


(b) Moving bubbles to convert to ANDs and ORs 


X5 


(c) Circuit with AND and OR gates 


Figure 8.14 — Circuit for Example 8.8. 


Example 8.9 The circuit in Figure 8.15 consists of NAND and NOR gates. It can be analyzed as follows. 
Py = X2x3 
P> = xP; =X\ +P; 


P3 = X3X4 = X3 + X4 


8.2 ANALYSIS OF MULTILEVEL CIRCUITS 


Figure 8.15 — Circuit for Example 8.9. 


P, =P) + P3 
f = Pa+x5 = Paxs 
= P, + P3 “X5 
= (Po + P3)x5 
= (+ Pi + X3 + Xa)Xs 
= (X, + x2x3 + X3 + X4)X5 


= (X + x2 +%3 + X4)X5 


= X1xX5 + XX5 + X3X5 + X4X5 


Note that in deriving the second to the last line, we used property 16a in Section 2.5 to 
simplify x2x3 + x3 into x2 + X3. 

Analysis of circuits is much simpler than synthesis. With a little practice one can 
develop an ability to easily analyze even fairly complex circuits. 
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We have now covered a considerable amount of material on synthesis and analysis of 
logic functions. We have used the Karnaugh map as a vehicle for illustrating the concepts 
involved in finding optimal implementations of logic functions. We have also shown that 
logic functions can be realized in a variety of forms, both with two levels of logic and with 
multiple levels. In a modern design environment, logic circuits are synthesized using CAD 
tools, rather than by hand. The concepts that we have discussed in this chapter are quite 
general; they are representative of the strategies implemented in CAD algorithms. As we 
have said before, the Karnaugh map scheme for representing logic functions is not appro- 
priate for use in CAD tools. In the next section we discuss two alternative representations 
of logic functions, which are suitable for use in CAD algorithms. 
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8.3. ALTERNATIVE REPRESENTATIONS OF LOGIC 
FUNCTIONS 


So far in this book, we have encountered four different forms for representing logic func- 
tions: truth tables, algebraic expressions, Venn diagrams, and Karnaugh maps. Two other 
possibilities are introduced in this section, called cubical notation and binary decision dia- 
grams. 


8.3.1 CUBICAL REPRESENTATION 


Cubical notation represents an n-variable logic function by mapping it onto an n-dimensional 
cube. We can illustrate this representation by first introducing some simple examples of 
cubes. 


Two-Dimensional Cube 

A two-dimensional cube is shown in Figure 8.16. The four corners in the cube are 
called vertices, which correspond to the four rows of a truth table. Each vertex is identified 
by two coordinates. The horizontal coordinate is assumed to correspond to variable x,, and 
vertical coordinate to x2. Thus vertex 00 is the bottom-left corner, which corresponds to 
row O in the truth table. Vertex 01 is the top-left corner, where x; = 0 and x2 = 1, which 
corresponds to row | in the truth table, and so on for the other two vertices. 

We will map a function onto the cube by indicating with blue circles those vertices for 
which f = 1. In Figure 8.16 f = 1| for vertices 01, 10, and 11. We can express the function 
as a Set of vertices, using the notation f = {01, 10, 11}. The function f is also shown in 
the form of a truth table in the figure. 

An edge joins two vertices for which the labels differ in the value of only one variable. 
Therefore, if two vertices for which f = | are joined by an edge, then this edge represents 
that portion of the function just as well as the two individual vertices. For example, f = | 
for vertices 10 and 11. They are joined by the edge that is labeled 1x. It is customary to use 
the letter x to denote the fact that the corresponding variable can be either 0 or 1. Hence 1x 
means that x; = 1, while x2 can be either 0 or 1. Similarly, vertices 01 and 11 are joined 
by the edge labeled x1, indicating that x; can be either 0 or 1, but x2 = 1. The reader must 


Oo x Ul f 
0 0 | 0 
x 0 1 1 
rl Ix to 4 
14 ]4 
aa 
00 10 


Figure 8.16 Representation of f(x, x2) = > m(1, 2, 3). 
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not confuse the use of the letter x for this purpose, in contrast to the subscripted use where 
x, and x2 refer to the variables. 

Two vertices being represented by a single edge is the embodiment of the combining 
property 14a from Section 2.5. The edge 1x is the logical OR of vertices 10 and 11. It 
essentially defines the term x;, which is the sum of minterms x;X2 and x;x2. The property 
14a indicates that 


X1X2 + XjxX2 = Xj 


Therefore, finding edges for which f = 1| is equivalent to applying the combining property. 
Of course, this is also analogous to finding pairs of adjacent cells in a Karnaugh map for 
which f = 1. 

The edges 1x and x1 define fully the function in Figure 8.16; hence we can represent 
the function as f = {1x, x1}. This corresponds to the logic expression 


f =x +x 


which is also obvious from the truth table in the figure. 


Three-Dimensional Cube 

Figure 8.17 illustrates a three-dimensional cube. The x), x2, and x3 coordinates are as 
shown on the left. Each vertex is identified by a specific valuation of the three variables. 
The function f mapped onto the cube is the function from Figure 2.48, which was used in 
Figure 2.52b. There are five vertices for which f = 1, namely, 000, 010, 100, 101, and 
110. These vertices are joined by the five edges shown in blue, namely, x00, 0x0, x10, 1x0, 
and 10x. Because the vertices 000, 010, 100, and 110 include all valuations of x; and x2, 
when x; is 0, they can be specified by the term xx0. This term means that f = 1 if x3 = 0, 
regardless of the values of x; and x2. Notice that xx0 represents the front side of the cube, 
which is shaded in blue. 


0x0 
Xy 101 


A X3 
a 10x 
xy 


Figure 8.17 Representation of f (x1, x2, x3) = > m(0, 2, 4, 5, 6). 
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From the preceding discussion it is evident that the function f can be represented in 
several ways. Some of the possibilities are 


Ff = {000, 010, 100, 101, 110} 
= {0x0, 1x0, 101} 
= {x00, x10, 101} 
= {x00, x10, 10x} 
= {xx0, 10x} 


In a physical realization each of the above terms is a product term implemented by an 
AND gate. Obviously, the minimum-cost circuit is obtained if f = {xx0, 10x}, which is 
equivalent to the logic expression 


f =%3 4+ x1%2 
This is the expression that we derived using the Karnaugh map in Figure 2.52b. 


Four-Dimensional Cube 

Graphical images of two- and three-dimensional cubes are easy to draw. A four- 
dimensional cube is more difficult. It consists of 2 three-dimensional cubes with their 
corners connected. A simple way to visualize a four-dimensional cube is to have one cube 
placed inside the other cube, as depicted in Figure 8.18. We have assumed that the x, x2, 
and x3 coordinates are the same as in Figure 8.17, while x4 = 0 defines the outer cube and 
x4 = 1 defines the inner cube. Figure 8.18 indicates how the function f; of Figure 2.54 
is mapped onto the four-dimensional cube. To avoid cluttering the figure with too many 
labels, we have labeled only those vertices for which f; = 1. Again, all edges that connect 
these vertices are highlighted in blue. 

There are two groups of four adjacent vertices for which f; = | that can be represented 
as planes. The group comprising 0000, 0010, 1000, and 1010 is represented by x0x0. The 
group 0010, 0011, 0110, and 0111 is represented by Ox1x. These planes are shaded in the 
figure. The function f; can be represented in several ways, for example 


Js = {0000, 0010, 0011, 0110, 0111, 1000, 1010, 1111} 
= {00x0, 10x0, 0x10, Ox11, x111} 
= {x0x0, OxIx, x111} 


Since each x indicates that the corresponding variable can be ignored, because it can be 
either 0 or 1, the simplest circuit is obtained if f = {x0x0, Ox1x, x111}, which is equivalent 
to the expression 

Sa = X2X4q + X1x3 + XX3X4 
We derived the same expression in Figure 2.54. 


n-Dimensional Cube 


A function that has n variables can be mapped onto an n-dimensional cube. Although 
it is impractical to draw graphical images of cubes that have more than four variables, it 
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0000 


Figure 8.18 Representation of function f; from Figure 2.54 


is not difficult to extend the ideas introduced above to a general n-variable case. Because 
visual interpretation is not possible and because we normally use the word cube only for 
a three-dimensional structure, many people use the word hypercube to refer to structures 
with more than three dimensions. We will continue to use the word cube in our discussion. 
It is convenient to refer to a cube as being of a certain size that reflects the number of 
vertices in the cube. Vertices have the smallest size. Each variable has a value of 0 or | in 
a vertex. A cube that has an x in one variable position is larger because it consists of two 
vertices. For example, the cube 1x01 consists of vertices 1001 and 1101. A cube that has 
two x’s consists of four vertices, and so on. A cube that has k x’s consists of 2* vertices. 
An n-dimensional cube has 2” vertices. Two vertices are adjacent if they differ in the 
value of only one coordinate. Because there are n coordinates (axes in the n-dimensional 
cube), each vertex is adjacent to n other vertices. The n-dimensional cube contains cubes of 
lower dimensionality. Cubes of the lowest dimension are vertices. Because their dimension 
is zero, we will call them 0-cubes. Edges are cubes of dimension 1; hence we will call them 
1-cubes. A side of a three-dimensional cube is a 2-cube. An entire three-dimensional cube 
is a 3-cube, and so on. In general, we will refer to a set of 2 adjacent vertices as a k-cube. 
From the examples in Figures 8.17 and 8.18, it is apparent that the largest possible 
k-cubes that exist for a given function are equivalent to its prime implicants. In Section 8.4 
we will discuss minimization techniques that use the cubical representation of functions. 
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8.3.2 BINARY DECISION DIAGRAMS 


Binary decision diagrams (BDDs) represent logic functions in the form of a graph. BDDs 
are attractive for use in CAD tools because a logic function can be represented as a BDD 
without requiring a lot of memory space. To illustrate how we can derive a BDD for 
a function f, consider the two-variable truth table shown in Figure 8.19a. A graph that 
corresponds to this truth table is given in part (b) of the figure in the form of a decision tree. 
The circular nodes in this tree represent the input variables x; and x2. The node x; is the root 
of the tree. Each node has two associated edges. For a node x; the edge labeled 0 represents 
x; = 0, and the edge labeled 1 corresponds to x; = 1. The square nodes are the terminal 
nodes of the tree. Each terminal node specifies whether f = 0 or f = 1. Each path from 
the root node to a terminal node corresponds to a row in the truth table. For example, the 
top row of the truth table in Figure 8.19a has x; = x2 = 0 and f = 1. In the decision tree 


xX, %2 f 

0 0 1 

0 1 0 

1 0 1 

1 1 1 

(a) Truth table (b) Decision tree 


(c) Reducing nodes (d) BDD 


Figure 8.19 _ Derivation of a binary decision diagram (BDD). 
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the corresponding path starts at the node x,, follows the edge labeled 0 to the subsequent 
node x2, and then follows the edge labeled 0 to the terminal node 1. 

The decision tree is not an efficient way of representing a logic function, because the 
size of the tree doubles with each input variable. But we can reduce the number of nodes 
in the tree as illustrated in Figure 8.19c. Here, we have merged identical terminal nodes 
such that there is only one terminal node representing each of the values 0 and 1. We can 
then recognize that the node x2 on the righthand side of the tree is redundant and can be 
removed, because both of its edges lead to the same subsequent node. After all identical 
nodes and redundant nodes have been removed from the decision tree, the resulting tree is 
referred to in the technical literature as a binary decision diagram (BDD). A BDD for the 
truth table in Figure 8.19a is given in Figure 8.19d. It provides a compact representation 
of the logic function, and can also be used to provide a low-cost implementation. In Figure 
8.19d we can see that f has the value | if x; = 1 or if x} = 0 and x. = 0. Hence, we 
can write f = x) + X|X2 =x; +X2. In general, a BDD can be used to provide various 
implementations of a function, which may be in either two-level or multilevel form. 

Examples of BDDs for the logical AND and OR functions are given in Figure 8.20. 
Part (a) of the figure represents the function f = xx, and part (b) represents f = x, + xp. It 
is easy to extend these BDDs to represent larger AND and OR functions. Figure 8.21 shows 
how we can derive BDDs for XOR functions. Part (a) gives a decision tree for the function 
f = x1 ® x2. When identical terminal nodes in the decision tree are merged, no redundant 
nodes are created. The resulting BDD is shown in part (b) of the figure. Part (c) shows 
how this BDD can be extended to represent the three-input XOR function f = x; @ x2 © x3. 
Note that the terminal node 1| is reached from the root node x; by following paths on which 
the total number of edges labeled 1 is odd, as required for the XOR function. 


Derivation of BDDs using Shannon’s Expansion 

As shown in Figures 8.19 and 8.21, itis possible to derive a BDD by removing redundant 
nodes and merging identical nodes in a decision tree. Although this method can be used 
for any logic function, it is tedious when the function has more than a few input variables. 


(a) AND (b) OR 


Figure 8.20 —BDDs for the AND and OR functions. 
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(a) Decision tree (b) BDD 


(c) 3-input BDD 


Figure 8.21 Derivation of BDDs for XOR functions. 


Another way of deriving a BDD is to make use of Shannon’s expansion theorem, which 
we introduced in Section 4.1.2. Recall that Shannon’s expansion states that any Boolean 
function f (x1, X2,...,Xj,---,%Xn) can be written in the form 


SF 1, X20, 6 Xi, Xn) = Ke, + Xi fy 


where f, = f (1, X%2,...,0,...,%,) and fi, = f (x1, x2,..., 1,..., Xn). 

Consider the function f = x; + x2x3. Shannon’s expansion with respect to the input 
x, gives f = x1fe, + x1fc,, which can be represented using the graph in Figure 8.22a. The 
cofactors f;, and 7, can be evaluated as follows: 
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(a) Expansion using x, 


(c) Expansion using x, (d) BDD ordered x5, x), x3 


Figure 8.22 Derivation of BDDs for f = x, + x2x3. 


Fy, = OF x2x3 = x2x3 
te =1l+mxn=1 
We can now derive our BDD as shown in Figure 8.22b. The subfunction ff, = «2x3, which 


is highlighted in a blue color, uses the BDD for the two-input AND function that we showed 
in Figure 8.20a. 
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A useful property of BDDs is that for a given order of the input variables there is only 
one possible BDD. Using the terminology from Chapter 2, this means that for a given order 
of the inputs a BDD is a canonical representation of the function. However, we should 
consider the effect on a BDD if we change the order of variables. Figure 8.22c corresponds 
to the Shannon’s expansion of f using the order x2, x1, x3. Now, node x is the root of the 
tree and its edges point to the cofactors of f with respect to x.. We have 


fe =X +0-%3 = x1 
fo =X + 1-43 = x1 +3 


These expressions lead to the BDD in Figure 8.22d. The edge labeled 0 from node x2 points 
to the subfunction x;. The edge labeled 1 from node x2 points to the subfunction x; + x3, 
which is highlighted in blue. This subfunction uses the same BDD that we showed for the 
two-input OR function in Figure 8.20. 

Comparing the BDDs in parts (b) and (d) of Figure 8.22, we see that the complexity 
of a BDD for a given function can be affected by the order of variables. In Figure 8.22b 
there are five nodes in the BDD, while in Figure 8.22d there are six nodes. CAD tools that 
implement BDDs usually employ heuristic techniques which attempt to find a good order 
of the input variables when deriving BDDs. 


Example 8.10 Assume that we are given the BDD in Figure 8.22d, and that we wish to determine the effect 
on this BDD of swapping the positions of nodes x; and x2, so that x, becomes the root of the 
BDD. The swapping operation can be performed as indicated in Figure 8.23. The first step 


Xp Xp Node 
0 O 0 
0 1 % 
1 O 1 
1 1 1 
(a) BDD ordered xy, x1, x3 (b) Truth table (c) Order x1, x2, +3 


Figure 8.23 Reordering the nodes in a BDD. 
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is to isolate the nodes x; and x2 from the rest of the BDD, as shown by the dotted blue line 
in Figure 8.23a. Then, we enumerate the valuations of the inputs x; and x2 on each path 
that crosses this boundary, and determine which successor node is reached. The truth table 
in Figure 8.23b provides the required information. It shows that the path with x;x. = 00 
leads to the terminal node 0, the path with x;x2 = 01 leads to node x3, and the remaining 
paths lead to the terminal node 1. We can now create the diagram shown in part (c) of the 
figure, ensuring that all combinations of x,x, lead to the correct successor nodes. We now 
remove the redundant node x2 on the righthand side of the diagram, leading to the BDD in 
Figure 8.22b. 
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Consider the SOP expression f = x13 + x1%4 + X2X4 + X2xX3 + X1X2x3x4. We wish to create 
a BDD for this function with the input order x,, x2, x3, x4. The first step is to use Shannon’s 
expansion with respect to input x;, which is f = x) fe, + xif¢,. The cofactors are 


Se, = 0+ x3 +0 x4 + x2%4 + X2K3 + 1 + XoX3K4 
= X0X4 + X2K3 + X2X3X4 
te =1-xg + 1+ x4 + x9x4 + 2x3 + 0 + X203%4 


= X3 + X4 + X2X4 + X2X3 = 3 + X4 


For the cofactor f,, = x3 + x4 we can use the BDD from Figure 8.20b. But for the cofactor 
Se, = X2X4 + X2x3 + X2x3.x4 it is useful to apply Shannon’s expansion again, with respect to 
input x2. This gives 


fim =0-x4 +0- x3 4+ 1 + x3%4 = 23X45 
Fi = Le xg t+ 1-3 +0 + x3%4 = 13 + x4 


Now, we can use the BDDs in parts (a) and (b) of Figure 8.20 for the cofactors f;,z, and f;,.,. 
respectively. The above steps lead to the diagram in Figure 8.24a, where the BDD for the 
subfunction x3 + x4 is highlighted in blue. This subfunction is used for both the cofactor 
Sx, Which is reached when x; = 1, and the cofactor f;,,,, which is reached when x; = 0 and 
x2 = 1. The BDD for the subfunction x3x,4 is shown on the lefthand side of Figure 8.24a. 
This subfunction is used for the cofactor f;,z, and is reached when x; = x2 = 0. 

Merging the identical terminal nodes in the graph in Figure 8.24a, and also merging 
the two identical copies of node x4, leads to the BDD shown in Figure 8.24b. Since this 
BDD specifies that f is defined by the subfunction x3 + x4 when x; = | or x2 = 1, and 
f is defined by the subfunction x3x4 when x; = x2 = 0, then we can write the multilevel 
expression 


f = (1 +2) + (3 + x4) + 1X2 (13.4) 


It is easy to see how this expression relates to the SOP form above. 


Example 8.11 
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(a) Diagram (b) BDD 


Figure 8.24 _ Derivation of a BDD for the function in Example 8.11. 


Practical use of BDDs 

We have presented only a few introductory concepts related to BDDs, in an attempt 
to give the reader an appreciation for this method of representing logic functions. More 
detailed information can be found in various references, including [7—10]. The BDD concept 
has been implemented in a number of software programs that are available through the 
internet. These programs are often referred to as BDD packages. They provide many 
useful functions, including the ability to create BDDs for logic functions, reorder the inputs 
to optimize these BDDs, and perform logical operations using BDDs. Although a detailed 
discussion of these BDD packages is beyond the scope of this book, the user is encouraged to 
search for available packages and experiment with the provided functionality. Two popular 
packages that are often included as part of CAD tools for logic synthesis are called BuDDy 
[11] and CUDD [12]. 


8.4 OPTIMIZATION TECHNIQUES BASED ON CUBICAL 
REPRESENTATION 


In this section we introduce optimization techniques that utilize the cubical representation 
for logic functions. Cubical representation is well suited for implementation of minimiza- 
tion algorithms that can be programmed and run efficiently on computers. Such algorithms 
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are included in modern CAD tools. While the CAD tools can be used effectively without 
detailed knowledge of how their minimization algorithms are implemented, the reader may 
find it interesting to gain some insight into how this may be accomplished. 


8.4.1 A TABULAR METHOD FOR MINIMIZATION 


A tabular approach for minimization was proposed in the 1950s by Willard Quine [5] and 
Edward McCluskey [6]. It became popular under the name Quine-McCluskey method. 
While it is not efficient enough to be used in modern CAD tools, it is a simple method 
that illustrates the key issues. We will present it using the cubical notation discussed in 
Section 8.3. 


Generation of Prime Implicants 

As mentioned in Section 8.3, the prime implicants of a given logic function f are the 
largest possible k-cubes for which f = 1. For incompletely specified functions, which 
include a set of don’t-care vertices, the prime implicants are the largest k-cubes for which 
either f = | or f is unspecified. 

Assume that the initial specification of f is given in terms of minterms for which f = 1. 
Also, let the don’t-cares be specified as minterms. This allows us to create a list of vertices 
for which either f = 1 or it is a don’t-care condition. We can compare these vertices in 
pairwise fashion to see if they can be combined into larger cubes. Then we can attempt to 
combine these new cubes into still larger cubes and continue the process until we find the 
prime implicants. 

The basis of the method is the combining property of Boolean algebra 


XiXj + XX; = Xj 


If we have two cubes that are identical in all variables (coordinates) except one, for which 
one cube has the value 0 and the other has 1, then these cubes can be combined into a larger 
cube. For example, consider f (x1, ...,x4) = {1000, 1001, 1010, 1011}. The cubes 1000 
and 1001 differ only in variable x4; they can be combined into a new cube 100x. Similarly, 
1010 and 1011 can be combined into 101x. Then we can combine 100x and 101x into a 
larger cube 10xx, which means that the function can be expressed simply as f = x1X2. 

Figure 8.25 shows how we can generate the prime implicants of the function, f, in 
Figure 2.58. The function is defined as 


f@,...,x4) = )_ m0, 4, 8, 10, 11, 12, 13, 15) 


There are no don’t-care conditions. Since larger cubes can be generated only from the 
minterms that differ in just one variable, we can reduce the number of pairwise comparisons 
by placing the minterms into groups such that the cubes in each group have the same number 
of 1s, and sort the groups by the number of Is. Thus, it will be necessary to compare 
each cube in a given group only with all cubes in the immediately preceding group. In 
Figure 8.25, the minterms are ordered in this way in list 1. (Note that we indicated the 
decimal equivalents of the minterms as well, to facilitate our discussion.) The minterms, 
which are also called 0-cubes as explained in Section 8.3, can be combined into 1-cubes 
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List 1 List 2 List 3 


i i i CNS 


Figure 8.25 — Generation of prime implicants for the function in Figure 2.58. 


shown in list 2. To make the entries easily understood we indicated the minterms that are 
combined to form each 1-cube. Next, we check if the 0-cubes are included in the 1-cubes 
and place a check mark beside each cube that is included. We now generate 2-cubes from 
the 1-cubes in list 2. The only 2-cube that can be generated is xx00, which is placed in 
list 3. Again, the check marks are placed against the 1-cubes that are included in the 2-cube. 
Since there exists just one 2-cube, there can be no 3-cubes for this function. The cubes in 
each list without a check mark are the prime implicants of f. Therefore, the set, P, of prime 
implicants is 


P = {10x0, 101x, 110x, 1x11, 11x1, xx00} 
= {P1, P2, P3, Pa, Ps, Pot 


Determination of a Minimum Cover 


Having generated the set of all prime implicants, it is necessary to choose a minimum- 
cost subset that covers all minterms for which f = 1. As a simple measure we will assume 
that the cost is directly proportional to the number of inputs to all gates, which means to 
the number of literals in the prime implicants chosen to implement the function. 

To find a minimum-cost cover, we construct a prime implicant cover table in which there 
is a row for each prime implicant and a column for each minterm that must be covered. 
Then we place check marks to indicate the minterms covered by each prime implicant. 
Figure 8.26a shows the table for the prime implicants derived in Figure 8.25. If there is a 
single check mark in some column of the cover table, then the prime implicant that covers 
the minterm of this column is essential and it must be included in the final cover. Such 
is the case with pe, which is the only prime implicant that covers minterms 0 and 4. The 
next step is to remove the row(s) corresponding to the essential prime implicants and the 
column(s) covered by them. Hence we remove p¢ and columns 0, 4, 8, and 12, which leads 
to the table in Figure 8.26b. 
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Prime Minterm 
implicant 0 4 8 10 11 


(a) Initial prime implicant cover table 


Prime eae 
implicant 10 


= 


(b) After the removal of essential prime implicants 


Prime Minterm 
implicant 10 11 13 


(c) After the removal of dominated rows 


Figure 8.26 — Selection of a cover for the function in Figure 2.58. 


Now, we can use the concept of row dominance to reduce the cover table. Observe that 
pi covers only minterm 10 while pz covers both 10 and 11. We say that p2 dominates p,. 
Since the cost of p2 is the same as the cost of p1, it is prudent to choose p2 rather than p,, 
so we will remove p; from the table. Similarly, p; dominates p3, hence we will remove 
p3 from the table. Thus, we obtain the table in Figure 8.26c. This table indicates that we 
must choose pz to cover minterm 10 and ps to cover minterm 13, which also takes care of 
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covering minterms 11 and 15. Therefore, the final cover is 


C = {po, ps, Po} 
= {101x, I11x1, xx00} 


which means that the minimum-cost implementation of the function is 
f = X1X2x3 + X1X2X4 + H3X4 


This is the same expression as the one derived from Figure 2.58. 

In this example we used the concept of row dominance to reduce the cover table. We 
removed the dominated rows because they cover fewer minterms and the cost of their prime 
implicants is the same as the cost of the prime implicants of the dominating rows. However, 
a dominated row should not be removed if the cost of its prime implicant is less than the 
cost of the dominating row’s prime implicant. 

The tabular method can be used with don’t-care conditions as illustrated in the following 
example. 


Example 8.12 


The don’t-care minterms are included in the initial list in the same way as the minterms for 
which f = 1. Consider the function 


f (x1, .-..%4) = }) m0, 2, 5,6, 7, 8, 9, 13) + D(A, 12, 15) 


We encourage the reader to derive a Karnaugh map for this function as an aid in visual- 
izing the derivation that follows. Figure 8.27 depicts the generation of prime implicants, 
producing the result 


P = {00x0, 0x10, O11x, xOOx, xx01, 1xOx, x1x1} 
= {P1, P2, P3, Pa; Ps, Po, P7} 


The initial prime implicant cover table is shown in Figure 8.28a. The don’t-care 
minterms are not included in the table because they do not have to be covered. There are no 
essential prime implicants. Examining this table, we see that column 8 has check marks in 
the same rows as column 9. Moreover, column 9 has an additional check mark in row ps. 
Hence column 9 dominates column 8. We refer to this as the concept of column dominance. 
When one column dominates another, we can remove the dominating column, which is 
column 9 in this case. Note that this is in contrast to rows where we remove dominated 
(rather than dominating) rows. The reason is that when we choose a prime implicant to 
cover the minterm that corresponds to the dominated column, this prime implicant will 
also cover the minterm corresponding to the dominating column. In our example, choosing 
either p4 or Pp covers both minterms 8 and 9. Similarly, column 13 dominates column 5, 
hence column 13 can be deleted. 

After removing columns 9 and 13, we obtain the reduced table in Figure 8.28). In 
this table row p4 dominates pe and row p7 dominates ps. This means that ps and pe, can be 
removed, giving the table in Figure 8.28c. Now, p4 and p7 are essential to cover minterms 
8 and 5, respectively. Thus, the table in Figure 8.28d is obtained, from which it is obvious 
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List 2 List 3 


0 0,1,8,9 x00x 
0 
1 
0 
0 
1 
0 
0 


1,5,9,13 x x01 
8,9,12,13 1xOx 


0 
0 
0 


5,7,13,15 |] x 1x 1 


a 
< <<< A S\0UCUM 


<< <<< 


Figure 8.27 — Generation of prime implicants for the function in Example 8.12. 


that p2 covers the remaining minterms 2 and 6. Note that row p2 dominates both rows p; 
and p3. 
The final cover is 


C= {p2, P4; p7} 
= {0x10, x00x, x1x1} 


and the function is implemented as 


f = X1x3X4 + X2X3 + X2X4 


In Figures 8.26 and 8.28, we used the concept of row and column dominance to reduce 
the cover table. This is not always possible, as illustrated in the following example. 


Consider the function Example 8.13 
fQ1,.--,%4) = Smo, 3, 10, 15) + DC, 2, 7, 8, 11, 14) 
The prime implicants for this function are 
P = {00xx, x0xO, xO1x, xx11, 1x1x} 
= {P1- P2, P3, Pa, Ps} 
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Prime Minterm 
implicant 0 2 5 6 7 


Prime Minterm 
implicant 2 6 


Py Vv 
P2 vov 
P3 Vv 


(d) After including p4 and p7 
(c) After the removal of rows ps and Po in the cover 


Figure 8.28 Selection of a cover for the function in Example 8.12. 
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Prime 
implicant 


(a) Initial prime implicant cover table 


Prime aes 
implicant 


(b) After including p3 in the cover 


Prime ae 
implicant 


(c) After excluding p3 from the cover 


Figure 8.29 — Selection of a cover for the function in 
Example 8.13. 


The initial prime implicant cover table is shown in Figure 8.29a. There are no essential prime 
implicants. Also, there are no dominant rows or columns. Moreover, all prime implicants 
have the same cost because each of them is implemented with two literals. Thus, the table 
does not provide any clues that can be used to select a minimum-cost cover. 

A good practical approach is to use the concept of branching. We can choose any prime 
implicant, say p3, and first choose to include this prime implicant in the final cover. Then 
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we can determine the rest of the final cover in the usual way and compute its cost. Next we 
try the other possibility by excluding p3 from the final cover and determine the resulting 
cost. We compare the costs and choose the less expensive alternative. 

Figure 8.29b gives the cover table that is left if p3 is included in the final cover. The 
table does not include minterms 3 and 10 because they are covered by p3. The table indicates 
that a complete cover must include either p, or py to cover minterm 0 and either py, or ps to 
cover minterm 15. Therefore, a complete cover can be 


C = {pi, p3, pa} 


The alternative of excluding p3 leads to the cover table in Figure 8.29c. Here, we see that 
a minimum-cost cover requires only two prime implicants. One possibility is to choose p, 
and ps. The other possibility is to choose pz and ps. Hence a minimum-cost cover is just 


Crnin = {p1, ps} 
= {00xx, Ix1x} 


The function is realized as 


Sf = 1X2 + x13 


Summary of the Tabular Method 
The tabular method can be summarized as follows: 


1. Starting with a list of cubes that represent the minterms where f = 1 or a don’t-care 
condition, generate the prime implicants by successive pairwise comparisons of the 
cubes. 


2. Derive a cover table which indicates the minterms where f = 1 that are covered by 
each prime implicant. 


3. Include the essential prime implicants (if any) in the final cover and reduce the table 
by removing both these prime implicants and the covered minterms. 


4. Use the concept of row and column dominance to reduce the cover table further. A 
dominated row is removed only if the cost of its prime implicant is greater than or 
equal to the cost of the dominating row’s prime implicant. 


5. Repeat steps 3 and 4 until the cover table is either empty or no further reduction of 
the table is possible. 


6. If the reduced cover table is not empty, then use the branching approach to determine 
the remaining prime implicants that should be included in a minimum cost cover. 


The tabular method illustrates how an algebraic technique can be used to generate the 
prime implicants. It also shows a simple approach for dealing with the covering problem, 
to find a minimum-cost cover. The method has some practical limitations. In practice, 
functions are seldom defined in the form of minterms. They are usually given either in the 
form of algebraic expressions or as sets of cubes. The need to start the minimization process 
with a list of minterms means that the expressions or sets have to be expanded into this 
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form. This list may be very large. As larger cubes are generated, there will be numerous 
comparisons performed and the computation will be slow. Using the cover table to select 
the optimal set of prime implicants is also computationally intensive when large functions 
are involved. 

Many algebraic techniques have been developed, which aim to reduce the time that it 
takes to generate the optimal covers. While most of these techniques are beyond the scope 
of this book, we will briefly discuss one possible approach in the next section. A reader who 
intends to use the CAD tools, but is not interested in the details of automated minimization, 
may skip this section without loss of continuity. 


8.4.2 A CUBICAL TECHNIQUE FOR MINIMIZATION 


Assume that the initial specification of a function f is given in terms of implicants that are not 
necessarily either minterms or prime implicants. Then it is convenient to define an operation 
that will generate other implicants that are not given explicitly in the initial specification, 
but which will eventually lead to the prime implicants of f. One such possibility is known 
as the *-product operation, which is usually pronounced the “star-product” operation. We 
will refer to it simply as the *-operation. 


*-Operation 

The -operation provides a simple way of deriving a new cube by combining two cubes 
that differ in the value of only one variable. Let A = A,A2---A, and B = B,B,---B, be 
two cubes that are implicants of an n-variable function. Thus each coordinate A; and B; 
is specified as having the value 0, 1, or x. There are two distinct steps in the +-operation. 
First, the *«-operation is evaluated for each pair A; and B;, in coordinates i = 1,2,...,n, 
according to the table in Figure 8.30. Then based on the results of using the table, a set of 
rules is applied to determine the overall result of the +-operation. The table in Figure 8.30 
defines the coordinate «-operation, A; * B;. It specifies the result of A; + B; for each possible 
combination of values of A; and B;. This result is the intersection (i.e., the common part) 
of A and B in this coordinate. Note that when A; and B; have the opposite values (0 and 1, 
or vice versa), the result of the coordinate «-operation is indicated by the symbol ¢. We say 
that the intersection of A; and B; is empty. Using the table, the complete «-operation for A 
and B is defined as follows: 


C =AxB, such that 


A; * B 


Se ee BO 
So 


0 
0 
p 
0 


Figure 8.30 — The coordinate «-operation. 
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1. C=6if A; * B; = @ for more than one i. 


2. Otherwise, C; = A; * B; when A; « B; 4 6, and C; = x for the coordinate where 
Aj * B; = @. 


For example, let A = {0x0} and B = {111}. Then A; x B} =O*x 1=¢, Ar. * By = xx 1 = 
1, and A3 * Bz; =0*x1=4¢. Because the result is g in two coordinates, it follows from 
condition 1 that A* B=. In other words, these two cubes cannot be combined into 
another cube, because they differ in two coordinates. 

As another example, considerA = {11x} and B = {10x}. InthiscaseA; * B} =1*1= 
1, Ao * By = 1 * 0 =@, and A3 * Bz = x * x =x. According to condition 2 above, C; = 1, 
C2 = x, and C3 = x, which gives C = A « B = {1xx}. A larger 2-cube is created from two 
1-cubes that differ in one coordinate only. 

The result of the «-operation may be a smaller cube than the two cubes involved in the 
operation. Consider A = {1x1} and B = {11x}. Then C=Ax B= {111}. Notice that C 
is included in both A and B, which means that this cube will not be useful in searching for 
prime implicants. Therefore, it should be discarded by the minimization algorithm. 

As a final example, consider A = {x10} and B = {0x1}. Then C = A x B = {01x}. All 
three of these cubes are the same size, but C is not included in either A or B. Hence C has 
to be considered in the search for prime implicants. The reader may find it helpful to draw 
a Karnaugh map to see how cube C is related to cubes A and B. 


Using the «-Operation to Find Prime Implicants 

The essence of the »-operation is to find new cubes from pairs of existing cubes. In 
particular, it is of interest to find new cubes that are not included in the existing cubes. A 
procedure for finding the prime implicants may be organized as follows. 

Suppose that a function f is specified by means of a set of implicants that are represented 
as cubes. Let this set be denoted as the cover C* of f. Let c! and c/ be any two cubes in 
C*. Then apply the *-operation to all pairs of cubes in C*; let G‘*! be the set of newly 
generated cubes. Hence 


Ge! = cl «xc! forall c’, le CK 


Now a new cover for f may be formed by using the cubes in C* and G‘+!. Some of these 
cubes may be redundant because they are included in other cubes; they should be removed. 
Let the new cover be 


c+! = ck UG! — redundant cubes 


where U denotes the logical union of two sets, and the minus sign (—) denotes the removal 
of elements of a set. If C'+! 4 C*, then a new cover C'*? is generated using the same 
process. If C’+! = C*, then the cubes in the cover are the prime implicants of f. For an 
n-variable function, it is necessary to repeat the step at most n times. 

Redundant cubes that have to be removed are identified through pairwise comparison 
of cubes. Cube A = A,A2---A, should be removed if it is included in some cube B = 
B,B,---B,, which is the case if A; = B; or B; = x for every coordinate i. 
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Consider the function f (x1, x2, x3) of Figure 2.56. Assume that f is initially specified as 
a set of vertices that correspond to the minterms, mo, m1, m2, m3, and m7. Hence let the 
initial cover be C° = {000, 001, 010, O11, 111}. Using the +-operation to generate a new 
set of cubes, we obtain G! = {00x, 0x0, 0x1, 01x, x11}. Then C! = C° UG! - redundant 
cubes. Observe that each cube in C° is included in one of the cubes in G!; therefore, all 
cubes in C° are redundant. Thus C! = G!. 

The next step is to apply the +-operation to the cubes in C!, which yields G? = {000, 
001, Oxx, 0x1, 010, 01x, 011}. Note that all of these cubes are included in the cube Oxx; 
therefore, all but Oxx are redundant. Now it is easy to see that 


C? = C! UG? — redundant terms 
= {xl11, Oxx} 
since all cubes of C!, except x11, are redundant because they are covered by Oxx. 
Applying the *-operation to C? yields G? = {011} and 
C? = C* UG? - redundant terms 
= {xl1, Oxx} 
Since C* = C?, the conclusion is that the prime implicants of f are the cubes {x11, Oxx}, 
which represent the product terms x2x3 and x;. This is the same set of prime implicants that 
we derived using a Karnaugh map in Figure 2.56. 
Observe that the derivation of prime implicants in this example is similar to the tabular 


method explained in Section 8.4.1 because the starting point was a function, f, given as a 
set of minterms. 


Example 8.14 


As another example, consider the four-variable function of Figure 2.57. Assume that this 
function is initially specified as the cover C= {0101, 1101, 1110, O11x, xOIx}. Then 
successive applications of the +-operation and removing the redundant terms gives 


C! = {x0lx, x101, 01x1, x110, 1x10, Ox1x} 
C= {xO1x, x101, O1x1, Ox1x, xx10} 
C=C 


Therefore, the prime implicants are X2x3, x2%3%4, X1X2X4, X1X3, and x3X4. 


Example 8.15 


Determination of Essential Prime Implicants 

From a cover that consists of all prime implicants, it is necessary to extract a minimal 
cover. As we saw in Section 2.12.2, all essential prime implicants must be included in the 
minimal cover. To find the essential prime implicants, it is useful to define an operation 
that determines a part of a cube (implicant) that is not covered by another cube. One such 
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operation is called the #-operation (pronounced the “sharp operation’), which is defined as 
follows. 


#-Operation 

Again, let A = A;A2---A, and B = B,B---B, be two cubes (implicants) of an n- 
variable function. The sharp operation A#B leaves as a result “that part of A that is not 
covered by B.” Similar to the «-operation, the #-operation has two steps: A;#B; is evaluated 
for each coordinate i, and then a set of rules is applied to determine the overall result. 
The sharp operation for each coordinate is defined in Figure 8.31. After this operation is 
performed for all pairs (A;, B;), the complete #-operation is defined as follows: 


C = A#B, such that 


. C=A if A;#B; = ¢ for some i. 
2. C=@if A;#B; = « for all i. 


Otherwise, C = U(At ,Ao,..., By,...,An) , where the union is for all i for which 
A; = x and B; # x. 


The first condition corresponds to the case where cubes A and B do not intersect at all; 
namely, A and B differ in the value of at least one variable, which means that no part of A 
is covered by B. For example, let A = Ox1 and B = 11x. The coordinate #-products are 
A,#B, = 6, Ax#B, = 0, and A3#B3 = e. Then from rule | it follows that 0x1 # 11x = Ox1. 
The second condition reflects the case where A is fully covered by B. For example, Ox1 
# Oxx = @. The third condition is for the case where only a part of A is covered by B. In 
this case the #-operation generates one or more cubes. Specifically, it generates one cube 
for each coordinate i that is x in A;, but is not x in B;. Each cube generated is identical to 
A, except that A; is replaced by B;. For example, Oxx # 01x = 00x, and Oxx #010 ={00x, 
Ox1}. 

We will now show how the #-operation can be used to find the essential prime impli- 
cants. Let P be the set of all prime implicants of a given function f. Let p' denote one prime 
implicant in the set P and let DC denote the don’t-care vertices for f. (We use superscripts 
to refer to different prime implicants in this section because we are using subscripts to refer 
to coordinate positions in cubes.) Then p’ is an essential prime implicant if and only if 


p' #(P—p')#DC #9 


0 E 1) € A; #B; 
1 pee 
x 0 € 


Figure 8.31 — The coordinate #-operation. 
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This means that p! is essential if there exists at least one vertex for which f = 1 that is 
covered by p’, but not by any other prime implicant. The #-operation is also performed with 
the set of don’t-care cubes because vertices in p! that correspond to don’t-care conditions 
are not essential to cover. The meaning of p' # (P — p’) is that the #-operation is applied 
successively to each prime implicant in P. For example, consider P = {p!, p’, p*, p+} and 
DC = {d ld 2). To check whether P is essential, we evaluate 


((((p) #p!) #p) #p*) #d') #d? 


If the result of this expression is not ¢, then p? is essential. 
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In Example 8.14 we determined that the cubes x11 and Oxx are the prime implicants of Example 8.16 


the function f in Figure 2.56. We can discover whether each of these prime implicants is 
essential as follows 


xll #Oxx= 1114 @ 
Oxx # x11 = {00x, 0x0} 4 @ 
The cube x11 is essential because it is the only prime implicant that covers the vertex 


111, for which f = 1. The prime implicant Oxx is essential because it is the only one that 
covers the vertices 000, 001, and 010. This can be seen in the Karnaugh map in Figure 2.56. 


In Example 8.15 we found that the prime implicants of the function in Figure 2.57 are P = 
{xO1x, x101, O1x1, Ox1x, xx10}. Because this function has no don’t-cares, we compute 


x01x #(P —x01x) = 10114 ¢ 


This is computed in the following steps: xO1x # x101 = x01x, then xO1x # O1x1 = xOIx, 
then xO1x # Ox1x = 101x, and finally 101x # xx10 = 1011. Similarly, we obtain 


xl01 #(P-x101) = 11014 @ 

Olxl #(P-Olxl)=@ 

Oxlx # (P —- Ox1x) = @ 

xx 10 # (P —xx10) = 11104 @ 
Therefore, the essential prime implicants are x01x, x101, and xx10 because they are the 
only ones that cover the vertices 1011, 1101, and 1110, respectively. This is obvious from 
the Karnaugh map in Figure 2.57. 

When checking whether a cube A is essential, the #-operation with one of the cubes in 


P — A may generate multiple cubes. If so, then each of these cubes has to be checked using 
the #-operation with all of the remaining cubes in P — A. 


Example 8.17 
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Complete Procedure for Finding a Minimal Cover 


Having introduced the - and #-operations, we can now outline a complete procedure 
for finding a minimal cover for any n-variable function. Assume that the function f is 
specified in terms of vertices for which f = 1; these vertices are often referred to as the 
ON-set of the function. Also, assume that the don’t-care conditions are specified as a DC-set. 
Then the initial cover for f is a union of the ON and DC sets. 

Prime implicants of f can be generated using the *-operation, and the #-operation can 
be used to find the essential prime implicants. If the essential prime implicants cover the 
entire ON-set, then they form the minimum-cost cover for f. Otherwise, it is necessary to 
include other prime implicants until all vertices in the ON-set are covered. 

A nonessential prime implicant p' should be deleted if there exists a less-expensive 
prime implicant p’ that covers all vertices of the ON-set that are covered by p’. If the 
remaining nonessential prime implicants have the same cost, then a possible heuristic ap- 
proach is to arbitrarily select one of them, include it in the cover, and determine the rest of 
the cover. Then an alternative cover is generated by excluding this prime implicant, and 
the lower-cost cover is chosen for implementation. We already used this approach, which 
is often referred to as the branching heuristic, in Sections 2.12.2 and 8.4.1. 

The preceding discussion can be summarized in the form of the following minimization 
procedure: 


1. Let C? = ON UDC be the initial cover of function f and its don’t-care conditions. 


2. Find all prime implicants of C° using the +-operation; let P be this set of prime 
implicants. 


3. Find the essential prime implicants using the #-operation. A prime implicant p’ is 
essential if p' # (P — p') # DC # g. If the essential prime implicants cover all 
vertices of the ON-set, then these implicants form the minimum-cost cover. 

4. Delete any nonessential p' that is more expensive (i.e., a smaller cube) than some 
other prime implicant p/ if p' # DC # p/ = 9. 

5. Choose the lowest-cost prime implicants to cover the remaining vertices of the 


ON-set. Use the branching heuristic on the prime implicants of equal cost and retain 
the cover with the lowest cost. 


Example 8.18 


To illustrate the minimization procedure, we will use the function 
SF (1, X2,.X3, X4,X5) = > m0, 1,4, 8, 13, 15, 20, 21, 23, 26, 31) + D(5, 10, 24, 28) 


To help the reader follow the discussion, this function is also shown in the form of a 
Karnaugh map in Figure 8.32. 

Instead of f being specified in terms of minterms, let us assume that f is given as the 
following SOP expression 


f =X X3X4NXs5 + XXX 3XgXs + XX2N3X4N5 + AXpXQN3Ns + NpXQNZXs + Xp AZXgXs + XQX3X 4X5 


8.4 OPTIMIZATION TECHNIQUES BASED ON CUBICAL REPRESENTATION 


Figure 8.32 The function for Example 8.18. 


Also, we will assume that don’t-cares are specified using the expression 


DC = x1 x2%4X5 +X XH3X4Xs + Xp XX3X4X5 
Thus, the ON-set expressed as cubes is 
ON = {0x000, 11010, 00001, 011x1, 101x1, 1x111, x0100} 
and the don’t-care set is 
DC = {11x00, 01010, 00101} 


The initial cover C° consists of the ON-set and the DC-set: 


C° = {0x000, 11010, 00001, 011x1, 101x1, 1x111, x0100, 11x00, 01010, 00101} 
Using the *-operation, the subsequent covers obtained are 


C! = {0x000, O11x1, 101x1, 1x111, x0100, 11x00, 0000x, 00x00, x1000, 010x0, 110x0, 
x1010, 00x01, x1111, 0x101, 1010x, x0101, 1x100, 0010x} 

C? = {0x000, O11x1, 1OLx1, 1x11, 11x00, x1111, 0x101, 1x100, x010x, 00x0x, x10x0} 

C=C 


Therefore, P = C?. 

Using the #-operation, we find that there are two essential prime implicants: 00x0x 
(because it is the only one that covers the vertex 00001) and x10x0 (because it is the only one 
that covers the vertex 11010). The minterms of f covered by these two prime implicants 
are m(0, 1, 4, 8, 26). 

Next, we find that 1x100 can be deleted because the only ON-set vertex that it covers is 
10100 (7729), which is also covered by x010x and the cost of this prime implicant is lower. 
Note that having removed 1x100, the prime implicant x010x becomes essential because 
none of the other remaining prime implicants covers the vertex 10100. Therefore, x010x 
has to be included in the final cover. It covers m(20, 21). 
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There remains to find prime implicants to cover m(13, 15, 23, 31). Using the branching 
heuristic, the lowest-cost cover is obtained by including the prime implicants 011x1 and 
1x111. Thus the final cover is 


Cninimum = {00x0x, x10x0, x010x, O11x1, Ix111} 


The corresponding sum-of-products expression is 


S =X X2X4q + X2H3X5 + N2NZX4q + Xp HIAZAs + Xj A3X4X5 


Although this procedure is tedious when performed by hand, it is not difficult to write a 
computer program to implement the algorithm automatically. The reader should check 
the validity of our solution by finding the optimal realization from the Karnaugh map in 
Figure 8.32. 


8.4.3 PRACTICAL CONSIDERATIONS 


The purpose of the above discussion was to give the reader some idea about how mini- 
mization of logic functions may be automated for use in CAD tools. We chose a scheme 
that is not too difficult to explain. From the practical point of view, this scheme has some 
drawbacks. The main difficulty is that the number of cubes that must be considered in the 
process can be extremely large. 

If the goal of minimization is relaxed so that it is not imperative to find a minimum-cost 
implementation, then it is possible to derive heuristic techniques that produce good results 
in reasonable time. A technique of this type forms the basis of the widely used Espresso 
program, which is available from the University of California at Berkeley via the World 
Wide Web. Espresso is a two-level optimization program. Both input to the program and 
its output are specified in the format of cubes. Instead of using the »-operation to find the 
prime implicants, Espresso uses an implicant-expansion technique. (See Problem 8.25 for 
an illustration of the expansion of implicants.) A comprehensive explanation of Espresso 
is given in [14], while simplified outlines can be found in [2, 13]. 

The University of California at Berkeley has provided software programs that can be 
used for design of multilevel circuits, called MIS [15], SIS [16], and ABC [17]. They allow 
a user to apply various multilevel optimization techniques to a logic circuit. The user can 
experiment with different optimization strategies by applying techniques such as factoring 
and decomposition to all or part of a circuit. 

Numerous commercial CAD systems are on the market. Some companies whose pro- 
ducts are widely used are Altera, Cadence Design Systems, Mentor Graphics, Synopsys, and 
Xilinx. Information on their products is available on the World Wide Web. Each company 
provides logic synthesis software that can be used to target various types of chips, such as 
FPGAs, gate arrays, standard cells, and custom chips. Because there are many possible 
ways to synthesize a given circuit, as we saw in the previous sections, each commercial 
product uses a proprietary logic optimization strategy based on heuristic methods. 
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To describe CAD tools, some new terminology has been invented. In particular, we 
should mention two terms that are widely used in industry: technology-independent logic 
synthesis and technology mapping. The first term refers to techniques that are applied when 
optimizing a circuit without considering the resources available in the target chip. Most 
of the techniques presented in this chapter are of this type. The second term, technology 
mapping, refers to techniques that are used to ensure that the circuit produced by logic 
synthesis can be realized using the logic resources available in the target chip. A good 
example of technology mapping is the transformation from a circuit in the form of logic 
operations such as AND and OR into a circuit that consists of only NAND operations. This 
type of technology mapping is done when targeting a circuit to a gate array that contains 
only NAND gates. Another example is the translation from logic operations to lookup 
tables, which is done when targeting a design to an FPGA. 

Chapter 10 discusses the CAD tools in detail. It presents a typical design flow that a 
designer may use to implement a digital system. 


8.5 CONCLUDING REMARKS 


This chapter has attempted to provide the reader with an understanding of various aspects 
of synthesis for logic functions. In the next chapter we discuss asynchronous sequential 
circuits. In the design of these circuits we will make use of many of the synthesis techniques 
covered in the previous chapters, as well as the cubical notation that we introduced in this 
chapter. 


8.6 EXAMPLES OF SOLVED PROBLEMS 


This section presents some typical problems that the reader may encounter, and shows how 
such problems can be solved. 


Problem: Consider the four-input logic function Example 8.19 
f = X1X2K4 + X2H3X4 +X 1X23 
Show how this function can be implemented in an FPGA that has three-input LUTs. 


Solution: A straightforward implementation requires 4 three-input LUTs. Three of these 
LUTs are used for the AND operations, and the final LUT implements the OR operation. 
But, if the function is synthesized as 


Sf = X20 (x1K4 + X34) + 2X2 + 1X3) 


then only three LUTs are needed, as illustrated in Figure 8.33. The first LUT produces 
the subfunction g = x;x4 + x3X4, the second LUT generates h = X;x3, and the final LUT 
produces f = x.-g+X2-h. 
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a) 


Figure 8.33 = The function in Example 8.19. 


Example 8.20 Problem: In Example 8.11 we created a BDD for the function f = x1x3 + x1.x4 + x2%4 + 


X2X3 + X1X2x3x4. Use functional decomposition to derive a multilevel implementation of 
jf. Consider subfunctions of the inputs x3 and x4. 


Solution: Figure 8.34 shows f in the form of a Karnaugh map. It has the subfunction 
h = x3x4 in the first column, and the subfunction g = x3 + x4 in the second, third, and 
fourth columns. Hence 


Sf = 1X2) -h + x2 + X1X2 + 1X2) + gv 
= (%1X2) + X3X4 + (1 + x2) - (43 + X4) 


This is the same expression that was derived from the BDD in Example 8.11. It is possible 
to simplify this expression. Let k = (x; + x2), so that 


f =k-xuxm+k - (x3 + x4) 
=k-xgxgtk-x3 tk -x4 
=k-xuxmy+k »x3X4 $k x3 +k + x4 
= (k +k) + x3x4-+k (x3 + x4) 
= x3x4 + k + (x3 + x4) 
Here, we have used the absorption rule 13a from Section 2.5 to expand k - x3 into the two 


terms k - x3x4 + k - x3, which then leads to a simplification. Resubstituting k = (x, + x2) 
gives 


f = X3x4 + (x1 + x2) + (03 + X4) 
We can also derive this decomposition of f by examining the rows of the Karnaugh map 


in Figure 8.34 and identifying subfunctions of variables x; and x2. The second and fourth 
rows are the subfunction k = x; + x2, and the third row is the subfunction equal to 1. Thus 
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Figure 8.34 The function for Example 8.20. 


Sf = (%3xq + x3X4) > kK + (3x4) - 1 
= (3X4) > kK + (3X4) - kK + (43.x4) - k + x34 
= (%3x4 + X3X4 + X3X4) - kK + .x3%4 
= (x3 + x4) +k + .X3X4 
= (x3 +.X4) + (1 + X2) + 3x4 


Problem: Show how Shannon’s expansion can be used to create a BDD for the function Example 8.21 
Ff =X x3 + x2x4, with the input order x1, x2, 3, x4. 


Solution: Shannon’s expansion with respect to x; gives f = Xie, + xift,. The cofactors 
are 
fi, = 0+ x3 + X0%4 = 2X4 
te =1-x3 + 49N4 = 13 + X0K4 
For the cofactor f,, = x3 + x2x4 itis useful to apply Shannon’s expansion again, with respect 
to input x2. This gives 
tox =%3+0-%4 = x3 
Fey = 34+ 1+ x4 = 93 +4 


The resulting BDD is shown in Figure 8.35. The BDD for f¢, = x2x4 is highlighted in blue. 
The BDD for f,, is on the righthand side of the figure, with the subfunction f,,;, = x3 on the 
0-edge from node x2 and the subfunction for f,,,, = x3 + x4 on the 1-edge from node xp. 
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Figure 8.35 — The BDD for Example 8.21. 


Example 8.22 Problem: Create a BDD for the function used in Example 8.21 with the input order x1, 


X3,X2, X4. 


Solution: We can derive the BDD by swapping the inputs x2 and x3 in the BDD of Figure 
8.35. The first step is to isolate the part of the BDD that contains nodes x2 and x3, as shown 
by the dotted lines in Figure 8.36a. There are two edges, labeled E| and £5, that enter this 
part of the BDD. There is no reordering needed for edge EF, because it does not depend 
on node x3. But for edge £2, it is useful to create the truth table shown in Figure 8.36b to 
enumerate the paths in the BDD that start at Ey. The path representing x2x3 = 00 leads to 
the terminal node 0, the path for x2x3 = 10 leads to node x4, and the remaining two paths 
lead to the terminal node 1. 

We can now create the graph shown in Figure 8.37a, ensuring that all paths from edge 
E, reach the correct destination nodes shown in the truth table. Removing the redundant 
node x2 on the righthand side of this graph, and also merging the two remaining identical 
copies of node x2, results in the BDD in Figure 8.37b. This BDD is simpler than the one in 
Figure 8.35. 


Example 8.23 Problem: Use the tabular method of Section 8.4.1 to derive aminimum-cost SOP expression 


for the function 
Sf (1, 2. 4) = HH 3X4 + 3X4 + A HoNg + Xp ADNZX4 


assuming that there are also don’t-cares defined as D = (9, 12, 14). 
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wo 3 E, 

0 0 0 

0 1 1 

1 0 X4 

i of 1 
(a) Isolating x, and x; (b) Truth table 


Figure 8.36 _—_ Reordering the BDD in Figure 8.35. 


(a) Reordered tree (b) Order x,, x3, x5, X4 


Figure 8.37 — The BDD for Example 8.22. 
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List 2 List 3 


3,7,11,15 }x x11 


9,11,13,15} 1 xxl 
12,13,14,15} 1 1 x x 


<< 


Aj <<848 <<484 486 8 


a<<4- 464448448 


Figure 8.38 — Generation of prime implicants for the function in Example 8.23. 


Solution: The tabular method requires that we start with the function defined in the form 
of minterms. As found in Figure 2.68, the function f can also be represented as 


far, ..-,44) = Y>m(0, 1,3, 4,7, 11, 13, 15) + DQ, 12, 14) 


The corresponding eleven 0-cubes are placed in list | in Figure 8.38. 

Now, perform a pairwise comparison of all O-cubes to determine the 1-cubes shown 
in list 2, which are obtained by combining pairs of O-cubes. Note that all 0-cubes are 
included in the 1-cubes, as indicated by the checkmarks in list 1. Next, perform a pairwise 
comparison of all 1-cubes to obtain the 2-cubes in list 3. Some of these 2-cubes can be 
generated in multiple ways, but it is not useful to list a 2-cube more than once (for example, 
xOx1 in list 3 can be obtained by combining from list 2 the cubes 1,3 and 9,11 or by using 
the cubes 1,9 and 3,11). Note that all but three 1-cubes are included in the 2-cubes. It is not 
possible to generate any 3-cubes, hence all terms that are not included in some other term 
(the unchecked terms in list 2 and all terms in list 3) are the prime implicants of f. The set 
of prime implicants is 


P = {000x, 0x00, x100, xOx1, xx11, 1xx1, 11xx} 
= {P1, P2; P3; P4s Ps, Po, P7} 


To find the minimum-cost cover for f, construct the table in Figure 8.39a which shows 
all prime implicants and the minterms that must be covered, namely those for which f = 1. 
A checkmark is placed to indicate that a minterm is covered by a particular prime implicant. 
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Prime Minterm 
implicant 0 1 3 4 7 «11 13 =15 


(b) After the removal of rows p3, ps and pz, and columns 3, 7, 11 and 15 


Figure 8.39 — Selection of a cover for the function in Example 8.23. 


Since minterm 7 is covered only by ps, this prime implicant must be included in the final 
cover. Observe that row p2 dominates row p3, hence the latter can be removed. Similarly, 
row po dominates row p7. Removing rows ps, p3, and p7, as well as columns 3, 7, 11, and 
15 (which are covered by ps), leads to the reduced table in Figure 8.39b. In this table, pz 
and p¢ are essential. They cover minterms 0, 4, and 13. Thus, it remains only to cover 
minterm 1, which can be done by choosing either p; or p4. Since p4 has a lower cost, it 
should be chosen. Therefore, the final cover is 


C = {p2, Pa, Ps, Po} 
= {0x00, xOx1, xx11, 1xx1} 


and the function is implemented as 


Sf = XX3X4 + XoX4 + 3x4 + 1X4 
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Example 8.24 


Problem: Use the «-product operation to find all prime implicants of the function 

SL, «++ 4) = Hy X3H4 + AZAXq A HYXA4 + XY XIQ3A4 
assuming that there are also don’t-cares defined as D = )°(9, 12, 14). 
Solution: The ON-set for this function is 

ON = {0x00, xx11, 00x1, 1101} 
The initial cover, consisting of the ON-set and the don’t-cares, is 
C° = {0x00, xx11, 00x1, 1101, 1001, 1100, 1110} 

Using the *-operation, the subsequent covers obtained are 


C! = {0x00, xx11, 00x1, 000x, x100, 11x1, 10x1, 111x, x001, 1x01, 110x, 11x0} 
C* = {0x00, xx11, 000x, x100, xOx1, 1xx1, 11xx} 
CGC a C2 


Therefore, the set of all prime implicants is 


P = {X1X3X4, X3X4, X1XQX3, X2N3X4, X2X4, X1X4, XX} 


Example 8.25 


Problem: Find the minimum-cost implementation for the function 
Sf (XL, «6 X4) = Hy X3X4q + 3X4 + Ny HH4 + Xp XINZX4 


assuming that there are also don’t-cares defined as D = “SO, 12, 14). 
Solution: This is the same function used in Examples 8.23 and 8.24. In those examples, 
we found that the minimum-cost SOP implementation is 


f = Xx3xX4 + X1X3X4 + X2X4 + XL X4 


which requires four AND gates, one OR gate, and 13 inputs to the gates, for a total cost 
of 18. 
The minimum-cost POS implementation is 


f = (3 + x4) 1 + x4) (01 +X + x3 + X4) 


which requires three OR gates, one AND gate, and 11 inputs to the gates, for a total cost 
of 15. 

We can also consider a multilevel realization for the function. Applying the factoring 
concept to the above SOP expression yields 


SF = (1 +X + x3 )X4 + X1X3X4 
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Figure 8.40 = Circuit for Example 8.26. 


This implementation requires two AND gates, two OR gates, and 10 inputs to the gates, for 
a total cost of 14. Compared to the SOP and POS implementations, this has the lowest cost 
in terms of gates and inputs, but it results in a slower circuit because there are three levels 
of gates through which the signals must propagate. 


Problem: In several commercial FPGAs the logic blocks are four-input LUTs. Two such Example 8.26 
LUTs, connected as shown in Figure 8.40, can be used to implement functions of seven 
variables by using the decomposition 


fi, ...,%7) =f ga, ..., 44), X5, X6, X7] 


It is easy to see that functions such as f = x) X2x3x4Xx5X6xX7 and f = x; +X. +%x%3+4%4+ 
x5 +X + x7 can be implemented in this form. Show that there exist other seven-variable 
functions that cannot be implemented with 2 four-input LUTs. 


Solution: The truth table for a seven-variable function can be arranged as depicted in 
Figure 8.41. There are 27 = 128 minterms. Each valuation of the variables x,, x2, x3, and 
x4 selects one of the 16 columns in the truth table, while each valuation of x5, x5, and x7 
selects one of 8 rows. Since we have to use the circuit in Figure 8.40, the truth table for 
Ff can also be defined in terms of the subfunction g. In this case, it is g that selects one of 
the 16 columns in the truth table, instead of x1, x2, x3, and x4. Since g can have only two 
possible values, 0 and 1, we can have only two columns in the truth table. This is possible 
if there exist only two distinct patterns of 1s and Os in the 16 columns in Figure 8.40. 
Therefore, only a relatively small subset of seven-variable functions can be realized with 
just two LUTs. 
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Xj X4XZX4 


0000 0001 


Figure 8.41 — A possible format for truth tables of seven-variable 
functions. 


*8.8 


PROBLEMS 


Answers to problems marked by an asterisk are given at the back of the book. 
Implement the logic circuit in Figure 8.6b using NAND gates only. 
Implement the logic circuit in Figure 8.6b using NOR gates only. 

Implement the logic circuit in Figure 8.8b using NAND gates only. 
Implement the logic circuit in Figure 8.8b using NOR gates only. 


Consider the function f = x3x5 + X1xXo%X4 + X1X2X4 + X1X3H4 + Xp A3X4 + XpAQNs + XpX2X5. 
Derive a minimum-cost circuit that implements this function using NOT, AND, and OR 
gates. 


Derive a minimum-cost circuit that implements the function f (x1, ...,x4) = }> m(4, 7, 8, 
11) + D(12, 15). 

Find the simplest realization of the function f (x1, ...,x4) = > m(0, 3, 4, 7,9, 10, 13, 14), 
assuming that the logic gates have a maximum fan-in of two. 

Find the minimum-cost circuit for the function f(x, ...,x4) = }° m(0, 4, 8, 13, 14, 15). 


Assume that the input variables are available in uncomplemented form only. (Hint: Use 
functional decomposition.) 


Use functional decomposition to find the best implementation of the function f(x, ..., 
x5) = >> m(1, 2,7, 9, 10, 18, 19, 25, 31) + D(O, 15, 20, 26). How does your implementa- 
tion compare with the lowest-cost SOP implementation? Give the costs. 


In Figure 8.7 we decomposed the function f by first identifying subfunctions in the columns 
of the Karnaugh map. Repeat this decomposition by first identifying subfunctions in the 
rows of the Karnaugh map. 
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In Figure 8.8 we decomposed the function f by first identifying subfunctions in the rows 
of the Karnaugh map. Repeat this decomposition by first identifying subfunctions in the 
columns of the Karnaugh map. 


Create a BDD for the function in Figure 8.6 using the input order x;, x2, x3, X4. 
Repeat Problem 8.12 using the input order x4, x3, x2, x1. 

Create a BDD for the function in Figure 8.8 using the input order x1, x2, x3, X4, Xs. 
Repeat Problem 8.14 using the input order x5, x4, x3, X2, X1. 


For the function f = x,x3 + x2x4 in Example 8.21 show how to use Shannon’s expansion 
to derive the BDD in Figure 8.37). 


Show how to swap the variables x2 and x3 in the BDD in Figure 8.37 to derive the BDD 
in Figure 8.35. 


Use the tabular method discussed in Section 8.4.1 to find a minimum cost SOP realization 
for the function 


PGi a= YO, 24,5, 7,8.9; 15) 
Repeat Problem 8.18 for the function 


FQ, ..-,%4) = Y= m(0, 4, 6, 8,9, 15) + D(3, 7, 11, 13) 


Repeat Problem 8.18 for the function 


FH 845%4) = Syme, 3,4,5, 7,9, 11) + D(8, 12, 13, 14) 


Show that the following distributive-like rules are valid 
(A - B)}#C = (A#C) - (BEC) 
(A + B)#C = (A#C) + (BHC) 


Use the cubical representation and the method discussed in Section 8.4.2 to find a minimum- 
cost SOP realization of the function f (x1, ...,x4) = }> m(0, 2, 4, 5, 7, 8, 9, 15). 


Repeat Problem 8.22 for the function f(x), ...,%5) = X1X3X5 + XyxXoX3 + X2x3X4X5 + 
Xj XQXZH4 + Ny AYXR3X4X5 + Xj XIN 4Xs + AXjX3ZX4X5. 


Use the cubical representation and the method discussed in Section 8.4.2 to find a minimum- 
cost SOP realization of the function f (x1, ..., x4) defined by the ON-set ON = {00x0, 100x, 
x010, 1111} and the don’t-care set DC = {00x1, O11x}. 


In Section 8.4.2 we showed how the *-product operation can be used to find the prime 
implicants of a given function f. Another possibility is to find the prime implicants by 
expanding the implicants in the initial cover of the function. An implicant is expanded 
by removing one literal to create a larger implicant (in terms of the number of vertices 
covered). A larger implicant is valid only if it does not include any vertices for which 
jf =0. The largest valid implicants obtained in the process of expansion are the prime 
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X1X_X3 
X4X3 X 1x3 XX 
3X2 X3 2 XY 
NO NO NO NO 
Figure P8.1 Expansion of implicant X23. 


implicants. Figure P8.1 illustrates the expansion of the minterm X|x2.x3 of the function used 
in Example 8.14. Note from Figure 2.56 that 


Sf = X1X2X3 + x1 X2X3 +X X2X3 


In Figure P8.1 the word NO is used to indicate that the expanded term is not valid, 
because it includes one or more vertices from f. From the graph it is clear that the largest 
valid implicants that arise from this expansion are x2x3 and X,; they are prime implicants 
of f. 

Expand the other four minterms given in the initial cover in Example 8.14 to find all 
prime implicants of f. What is the relative complexity of this procedure compared to the 
*-product technique? 


Repeat Problem 8.25 for the function in Example 8.15. Expand the implicants given in the 
initial cover C°. 


Find the minimum-cost circuit consisting only of two-input NAND gates for the function 
fQ1,..+,%4) = = m(0, 1, 2,3, 4, 6, 8,9, 12). Assume that the input variables are avail- 
able in both uncomplemented and complemented forms. (Hint: Consider the complement 
of the function.) 

Repeat Problem 8.27 for the function f (x1, ...,x4) = m(2, 3, 6, 8,9, 12). 


Find the minimum-cost circuit consisting only of two-input NOR gates for the function 
f(x, ...,x4) = > m(6, 7, 8, 10, 12, 14, 15). Assume that the input variables are available 
in both uncomplemented and complemented forms. (Hint: Consider the complement of 
the function.) 


Repeat Problem 8.29 for the function f (x, ...,.x4) = }> m(2, 3, 4,5, 9, 10, 11, 12, 13, 15). 
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ASYNCHRONOUS SEQUENTIAL CIRCUITS 


CHAPTER OBJECTIVES 


In this chapter you will learn about: 


e Sequential circuits that are not synchronized by a clock 
e Analysis of asynchronous sequential circuits 

e Synthesis of asynchronous sequential circuits 

e The concept of stable and unstable states 

e Hazards that cause incorrect behavior of a circuit 

e Timing issues in digital circuits 
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In Chapter 6 we covered the design of synchronous sequential circuits in which the state variables are 
represented by flip-flops that are controlled by a clock. The clock is a periodic signal that consists of pulses. 
Changes in state can occur on the positive or negative edge of each clock pulse. Since they are controlled by 
pulses, synchronous sequential circuits are said to operate in pulse mode. In this chapter we present sequential 
circuits that do not operate in pulse mode and do not use flip-flops to represent state variables. These circuits 
are called asynchronous sequential circuits. 

In an asynchronous sequential circuit, changes in state are not triggered by clock pulses. Instead, changes 
in state are dependent on whether each of the inputs to the circuit has the logic level 0 or | at any given time. 
To achieve reliable operation, the inputs to the circuit must change in a specific manner. In this introductory 
discussion we will concentrate on the simplest case in which a constraint is imposed that the inputs must 
change one at a time. Moreover, there must be sufficient time between the changes in input signals to allow 
the circuit to reach a stable state, which is achieved when all internal signals stop changing. A circuit that 
adheres to these constraints is said to operate in the fundamental mode. 

Asynchronous circuits are much more difficult to design than synchronous circuits. Specialized tech- 
niques, which are beyond the scope of this book, have been developed for dealing with large asynchronous 
circuits. Our main reason for the discussion in this chapter is the fact that the asynchronous circuits, even 
in their simplest form, provide an excellent vehicle for gaining a deeper understanding of the operation of 
digital circuits in general. In particular, they illustrate the timing issues caused by propagation delays in logic 
circuits. 

The design approaches presented in this chapter are classical techniques that are suitable only for very 
small circuits. They are easy to understand and they demonstrate the problems that arise from timing con- 
straints. In synchronous circuits these problems are avoided by using a clock as a synchronizing mechanism. 


| 9.1 ASYNCHRONOUS BEHAVIOR 


To introduce asynchronous sequential circuits, we will reconsider the basic latch circuit in 
Figure 5.3. This Set-Reset (SR) latch is redrawn in Figure 9.1a. The feedback loop gives 
rise to the sequential nature of the circuit. It is an asynchronous circuit because changes in 
the value of the output, Q, occur without having to wait for a synchronizing clock pulse. In 
response to a change in either the S (Set) or R (Reset) input, the value of Q will change after 
a short propagation time through the NOR gates. In Figure 9.1a the combined propagation 
delay through the two NOR gates is represented by the box labeled A. Then, the NOR 
gate symbols represent ideal gates with zero delay. Using the notation in Chapter 6, Q 
corresponds to the present state of the circuit, represented by the present-state variable, y. 
The value of y is fed back through the circuit to generate the value of the next-state variable, 
Y, which represents the next state of the circuit. After the A time delay, y takes the value 
of Y. Observe that we have drawn the circuit in a style that conforms to the general model 
for sequential circuits presented in Figure 6.85. 

By analyzing the SR latch, we can derive a state-assigned table, as illustrated in Figure 
9.1b. When the present state is y = 0 and the inputs are S = R = 0, the circuit produces 
Y = 0. Since y = Y, the state of the circuit will not change. We say that the circuit is stable 
under these input conditions. Now assume that R changes to 1 while S remains at 0. The 
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(a) Circuit with modeled gate delay 


Present 
state SR=00 Ol 1 


OlORRO 
@o @ 9 


(b) State-assigned table 


Figure 9.1 ~~ Analysis of the SR latch. 


circuit still generates Y = 0 and remains stable. Assume next that S changes to | and R 
remains at 1. The value of Y is unchanged, and the circuit is stable. Then let R change to 0 
while S remains at 1. This input valuation, SR = 10, causes the circuit to generate Y = 1. 
Since y # Y, the circuit is not stable. After the A time delay, the circuit changes to the new 
present state y = 1. Once this new state is reached, the value of Y remains equal to | as 
long as SR = 10. Hence the circuit is again stable. The analysis for the present state y = 1 
can be completed using similar reasoning. 

The concept of stable states is very important in the context of asynchronous sequential 
circuits. For a given valuation of inputs, if a circuit reaches a particular state and remains in 
this state, then the state is said to be stable. To clearly indicate the conditions under which 
the circuit is stable, it is customary to encircle the stable states in the table, as illustrated in 
Figure 9.15. 

From the state-assigned table, we can derive the state table in Figure 9.2a. The state 
names A and B represent the present states y = 0 and y = 1, respectively. Since the output 
Q depends only on the present state, the circuit is a Moore-type FSM. The state diagram 
that represents the behavior of this FSM is shown in Figure 9.2. 

The preceding analysis shows that the behavior of an asynchronous sequential circuit 
can be represented as an FSM in a similar way as the synchronous sequential circuits in 
Chapter 6. Consider now performing the opposite task. That is, given the state table in 
Figure 9.2a, we can synthesize an asynchronous circuit as follows: After performing the 
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Present Next state Output 
state | sr=00 01 10 11] & 


A (A) (A) B (A)| 0 
B (B) A @) Al 1 


(a) State table 


SR 
10 


01 
11 


(b) State diagram 


Figure 9.2 = FSM model for the SR latch. 


state assignment, we have the state-assigned table in Figure 9.1b. This table represents a 
truth table for Y, with the inputs y, S, and R. Deriving a minimal product-of-sums expression 
yields 


Y=R-(S+y) 


If we were deriving a synchronous sequential circuit using the methods in Chapter 6, then 
Y would be connected to the D input of a flip-flop and a clock signal would be used to 
control the time when the changes in state take place. But since we are synthesizing an 
asynchronous circuit, we do not insert a flip-flop in the feedback path. Instead, we create a 
circuit that realizes the preceding expression using the necessary logic gates, and we feed 
back the output signal as the present-state input y. Implementation using NOR gates results 
in the circuit in Figure 9.1a. This simple example suggests that asynchronous circuits and 
synchronous circuits can be synthesized using similar techniques. However, we will see 
shortly that for more complex asynchronous circuits, the design task is considerably more 
difficult. 

To further explore the nature of asynchronous circuits, it is interesting to consider how 
the behavior of the SR latch can be represented in the form of a Mealy model. As depicted 
in Figure 9.3, the outputs produced when the circuit is in a stable state are the same as 
in the Moore model, namely 0 in state A and | in state B. Consider now what happens 
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Present Next state Output, Q 
state | sR=00 O1 10 11/00 O1 10 11 


A (A) (A) B ()] 0 0 - 0 
B (B) A @®) Ali - 1 = 


(a) State table 


SRQ 
107— 
oes 00/1 
ae 10/1 
11/0 
O1/- 


11/- 
(b) State diagram 


Figure 9.3. Mealy representation of the SR latch. 


when the state of the circuit changes. Suppose that the present state is A and that the input 
valuation SR changes from 00 to 10. As the state table specifies, the next state of the FSM 
is B. When the circuit reaches state B, the output Q will be 1. But in the Mealy model, the 
output is supposed to be affected immediately by a change in the input signals. Thus while 
still in state A, the change in SR to 10 should result in Q = 1. We could have written a 1 
in the corresponding entry in the top row of the state table, but we have chosen to leave 
this entry unspecified instead. The reason is that since Q will change to | as soon as the 
circuit reaches state B, there is little to be gained in trying to make Q go to | a little sooner. 
Leaving the entry unspecified allows us to assign either 0 or | to it, which may make the 
circuit that implements the state table somewhat simpler. A similar reasoning leads to the 
conclusion that the two output entries where a change from B to A takes place can also be 
left unspecified. 

Using the state assignment y = 0 for A and y = | for B, the state-assigned table rep- 
resents a truth table for both Y and Q. The minimal expression for Y is the same as for the 
Moore model. To derive an expression for Q, we need to set the unspecified entries to 0 or 
1. Assigning a 0 to the unspecified entry in the first row and | to the two unspecified entries 
in the second row produces Q = y and results in the circuit in Figure 9.1a. 


Terminology 


In the preceding discussion we used the same terminology as in the previous chapter 
on synchronous sequential circuits. However, when dealing with asynchronous sequential 
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circuits, it is customary to use two different terms. Instead of a “state table,” it is more 
common to speak of a flow table, which indicates how the changes in state flow as a result 
of the changes in the input signals. Instead of a “state-assigned table,” it is usual to refer 
to a transition table or an excitation table. We will use the terms flow table and excitation 
table in this chapter. A flow table will define the state changes and outputs that must be 
generated. An excitation table will depict the transitions in terms of the state variables. The 
term excitation table derives from the fact that a change from a stable state is performed by 
“exciting” the next-state variables to start changing towards a new state. 


9.2 ANALYSIS OF ASYNCHRONOUS CIRCUITS 


To gain familiarity with asynchronous circuits, it is useful to analyze a few examples. We 
will keep in mind the general model in Figure 6.85, assuming that the delays in the feedback 
paths are a representation of the propagation delays in the circuit. Then each gate symbol 
will represent an ideal gate with zero delay. 


Example 9.1 


GATED D LATCH In Chapters 5 and 6, we used the gated D latch as a key component in 
circuits that are controlled by a synchronizing clock. It is instructive to analyze this latch as 
an asynchronous circuit, where the clock is just one of the inputs. It is reasonable to assume 
that the signals on the D and clock inputs do not change at the same time, thus meeting the 
basic requirement of asynchronous circuits. 

Figure 9.4a shows the gated D latch drawn in the style of the model of Figure 6.85. This 
circuit was introduced in Figure 5.7 and discussed in Section 5.3. The next-state expression 
for this circuit is 


Y=(CtD) t (CTD) ty) 
= CD + Cy + Dy 


The term Dy in this expression is redundant and could be deleted without changing the logic 
function of Y. Hence the minimal expression is 


Y=CD+Cy 


The reason that the circuit implements the redundant term Dy is that this term solves a race 
condition known as a hazard; we will discuss hazards in detail in Section 9.6. 

Evaluating the expression for Y for all valuations of C, D, and y leads to the excitation 
table in Figure 9.4b. Note that the circuit changes its state only when C = | and D is 
different from the present state, y. In all other cases the circuit is stable. Using the symbols 
A and B to represent the states y = 0 and y = 1, we obtain the flow table and the state 
diagram shown in parts (c) and (d). 
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(a) Circuit 
Present Next state 
state cD=00 O01 10 11 
di ¥ Y¥ ¥ ¥/|O 
? QQ @ 1] 0 
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(b) Excitation table 


Present Next state 
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state | CcD=00 O01 10 11 


A (A) (A) (A) B | 0 
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(c) Flow table 
CD ut 
Ox Ox 
x0 xl 
10 


(d) State diagram 


The gated D latch. 


Figure 9.4 
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Example 9.2 


MASTER-SLAVE D FLIP-FLOP ——In Example 9.1 we analyzed the gated D latch as an asyn- 
chronous circuit. Actually, all practical circuits are asynchronous. However, if the circuit’s 
behavior is tightly controlled by a clock signal, then simpler operating assumptions can 
be used, as we did in Chapter 6. Recall that in a synchronous sequential circuit all sig- 
nals change values in synchronization with the clock signal. Now we will analyze another 
synchronous circuit as if it were an asynchronous circuit. 

Two gated D latches are used to implement the master-slave D flip-flop, as illustrated in 
Figure 5.9. This circuit is reproduced in Figure 9.5. We can analyze the circuit by treating 
it as a series connection of two gated D latches. Using the results from Example 9.1, the 
simplified next-state expressions can be written as 


Yin =CD+ Ch 
¥o —_ Cym + Cys 


where the subscripts m and s refer to the master and slave stages of the flip-flop. These 
expressions lead to the excitation table in Figure 9.6a. Labeling the four states as $1 through 
S4, we derive the flow table in Figure 9.6b. A state-diagram form of this information is 
given in Figure 9.7. 

Let us consider the behavior of this FSM in more detail. The state $1, where y,,yv, = 00, 
is stable for all input valuations except CD = 11. When C = 1, the value of D is stored in 
the master stage; hence CD = 11 causes the flip-flop to change to $3, where y,, = 1 and 
ys = 0. If the D input now changes back to 0, while the clock remains at 1, the flip-flop 
moves back to the state $1. The transitions between $1 and $3 indicate that if C = 1, 
the output of the master stage, Q,,, = Ym, tracks the changes in the D input signal without 
affecting the slave stage. From S§3 the circuit changes to $4 when the clock goes to 0. In $4 
both master and slave stages are set to 1 because the information from the master stage is 
transferred to the slave stage on the negative edge of the clock. Now the flip-flop remains 
in S4 until the clock goes to | and the D input changes to 0, which causes a change to $2. 
In $2 the master stage is cleared to 0, but the slave stage remains at |. Again the flip-flop 
may change between $2 and $4 because the master stage will track the changes in the D 
input signal while C = 1. From $2 the circuit changes to S'1 when the clock goes low. 

In Figures 9.6 and 9.7, we indicated that the flip-flop has only one output Q, which one 
sees when the circuit is viewed as a negative-edge-triggered flip-flop. From the observer’s 


Master Slave 


Figure 9.5 — Circuit for the master-slave D flip-flop. 
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(a) Excitation table 
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Present Next state Output 
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(c) Flow table with unspecified entries 


Figure 9.6 —_ Excitation and flow tables for Example 9.2. 


point of view, the flip-flop has only two states, 0 and 1. But internally, the flip-flop consists 
of the master and slave parts, which gives rise to the four states described above. 

We should also examine the basic assumption that the inputs must change one at a time. 
If the circuit is stable in state $2, for which CD = 10, it is impossible to go from this state 
to $1 under the influence of the input valuation CD = 01 because this simultaneous change 
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Ox 
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Figure 9.7 — State diagram for the master-slave D flip-flop. 


in both inputs cannot occur. Thus in the second row of the flow table, instead of showing 
S2 changing to $1 under CD = 01, this entry can be labeled as unspecified. The change 
from $2 to $1 can be caused only by CD changing from 10 to 00. Similarly, if the circuit 
is in state $3, where CD = 11, it cannot change to $4 by having CD = 00. This entry can 
also be left unspecified in the table. The resulting flow table is shown in Figure 9.6c. 

If we reverse the analysis procedure and, using the state assignment in Figure 9.6a, 
synthesize logic expressions for Y,, and Y;, we get 


Yin = CD + CVn + YmD 
Y, = Cyn + Cys + YmYs 


The terms y,,D and y,,ys in these expressions are redundant. As mentioned earlier, they are 
included in the circuit to avoid race conditions, which are discussed in Section 9.6. 


Example 9.3 


Consider the circuit in Figure 9.8. It is represented by the following expressions 


Y, = yiy2 + wiy2 + Wiw2y1 
Y2 = yiy2 + Wiy2 + w2 + W1Wryy 
Z=Y\y2 


The corresponding excitation and flow tables are given in Figure 9.9. 

Some transitions in the flow table will not occur in practice because of the assumption 
that both w; and w2 cannot change simultaneously. In state A the circuit is stable under the 
valuation w2w; = 00. Its inputs cannot change to 11 without passing through the valuations 
01 or 10, in which case the new state would be B or C, respectively. Thus the transition 
from A under w2w; = 11 can be left unspecified. Similarly, if the circuit is stable in state 
B, in which case w2w; = 01, it is impossible to force a change to state D by changing the 
inputs to w2w, = 10. This entry should also be unspecified. If the circuit is stable in state C 
under w2w; = II, it is not possible to go to A by changing the inputs directly to w2w; = 00. 
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Figure 9.8 — Circuit for Example 9.3. 


However, the transition to A is possible by changing the inputs one at a time because the 
circuit remains stable in C for both w2w; = 01 and w2w, = 10. 

A different situation arises if the circuit is stable in state D under w2w; = 00. It may 
seem that the entry under w2w; = 11 should be unspecified because this input change 
cannot be made from the stable state D. But suppose that the circuit is stable in state B 
under w2w,; = 01. Now let the inputs change to w2w, = 11. This causes a change to state 
D. The circuit indeed changes to D, but it is not stable in this state for this input condition. 
As soon as it arrives into state D, the circuit proceeds to change to state C as required by 
w2w, = II. It is then stable in state C as long as both inputs remain at 1. The conclusion 
is that the entry that specifies the change from D to C under w2w; = 11 is meaningful and 
should not be omitted. The transition from the stable state B to the stable state C, which 
passes through state D, illustrates that it is not imperative that all transitions be directly from 
one stable state to another. A state through which a circuit passes en route from one stable 
state to another is called an unstable state. Transitions that involve passing through an 
unstable state are not harmful as long as the unstable state does not generate an undesirable 
output signal. For example, if a transition is between two stable states for which the output 
signal should be 0, it would be unacceptable to pass through an unstable state that causes 
the output to be 1. Even though the circuit changes through the unstable state very quickly, 
the short glitch in the output signal is likely to be troublesome. This is not a problem in our 
example. When the circuit is stable in B, the output is z = 0. When the inputs change to 
w2w , = II, the transition to state D maintains the output at 0. It is only when the circuit 
finally changes into state C that z will change to 1. Therefore, the change from z = 0 to 
z = | occurs only once during the course of these transitions. 
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(b) Flow table 


Figure 9.9 — Excitation and flow tables for the circuit in Figure 9.8. 
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Figure 9.10 = Modified flow table for Example 9.3. 


A modified flow table, showing the unspecified transitions, is presented in Figure 9.10. 
The table indicates the behavior of the circuit in Figure 9.8 in terms of state transitions. If 
we don’t know what the circuit is supposed to do, it may be difficult to discover the practical 
application for a given circuit. Fortunately, in practice the purpose of the circuit is known, 
and the analysis is done by the designer to ascertain that the circuit performs as desired. In 
our example it is apparent that the circuit generates the output z = 1| in state C, which it 
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reaches as a result of some input patterns that are detected using the other three states. The 
state diagram derived from Figure 9.10 is shown in Figure 9.11. 

This diagram actually implements a control mechanism for a simple vending machine 
that accepts two types of coins, say, dimes and nickels, and dispenses merchandise such as 
candy. If w; represents a nickel and w2 represents a dime, then a total of 10 cents must be 
deposited to get the FSM into state C where the candy is released. The coin mechanism 
accepts only one coin at a time, which means that w2w,; = 11 can never occur. Therefore, 
the transition discussed above, from B to C, through the unstable state D would not occur. 
Observe that both states B and D indicate that 5 cents has been deposited. State B indicates 
that a nickel is presently being sensed by the coin receptor, while D indicates that 5 cents 
has been deposited and the coin receptor is presently empty. In state D it is possible to 
deposit either a nickel or a dime, both leading to state C. No distinction is made between 
the two types of coins in state D; hence the machine would not give change if 15 cents 
is deposited. From state A a dime leads directly to state C. Knowing that the condition 
w2w, = 11 will not occur allows the flow table to be specified as shown in Figure 9.12. If 


WW 
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10 00 11 
< 00 
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Figure 9.11 State diagram for Example 9.3. 
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Figure 9.12 —_ Flow table for a simple vending machine. 
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we were to synthesize the sum-of-products logic expressions for Y; and Y, using the state 
assignment in Figure 9.9a, we would end up with the circuit in Figure 9.8. 


Steps in the Analysis Process 


We have demonstrated the analysis process using illustrative examples. The required 
steps can be stated as follows: 


e A given circuit is interpreted in the form of the general model in Figure 6.85. That is, 
each feedback path is cut, and a delay element is inserted at the point where the cut 
is made. The input signal to the delay element represents a corresponding next-state 
variable, Y;, while the output signal is the present-state variable, y;. A cut can be made 
anywhere in a particular loop formed by the feedback connection, as long as there is 
only one cut per (state variable) loop. Thus the number of cuts that should be made 
is the smallest number that results in there being no feedback anywhere in the circuit 
except from the output of a delay element. This minimal number of cuts is sometimes 
referred to as the cut set. Note that the analysis based on a cut made at one point in a 
given loop may not produce the same flow table as an analysis on a cut made at some 
other point in this loop. But both flow tables would reflect the same functional behavior 
in terms of the applied inputs and generated outputs. 


e Next-state and output expressions are derived from the circuit. 
e The excitation table corresponding to the next-state and output expressions is derived. 


e A flow table is obtained, associating some (arbitrary) names with the particular encoded 
States. 


e Acorresponding state diagram is derived from the flow table if desired. 


| 9.3. SYNTHESIS OF ASYNCHRONOUS CIRCUITS 


Synthesis of asynchronous sequential circuits follows the same basic steps used to synthesize 
the synchronous circuits, which were discussed in Chapter 6. There are some differences 
due to the asynchronous nature, which make the asynchronous circuits more difficult to 
design. We will explain the differences by investigating a few design examples. The basic 
steps are 

e Devise a state diagram for an FSM that realizes the required functional behavior. 

e Derive the flow table and reduce the number of states if possible. 

e Perform the state assignment and derive the excitation table. 

¢ Obtain the next-state and output expressions. 

e Construct a circuit that implements these expressions. 


When devising a state diagram, or perhaps the flow table directly, it is essential to ensure 
that when the circuit is in a stable state, the correct output signals are generated. Should it 


9.3. SYNTHESIS OF ASYNCHRONOUS CIRCUITS 


be necessary to pass through an unstable state, this state must not produce an undesirable 
output signal. 

Minimization of states is not straightforward. A minimization procedure is described 
in Section 9.4. 

State assignment is not done with the sole purpose of reducing the cost of the final circuit. 
In asynchronous circuits some state assignments may cause the circuit to be unreliable. We 
will explain this problem using the examples that follow. 
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SERIAL PARITY GENERATOR Suppose that we want to design a circuit that has an input 
w and an output z, such that when pulses are applied to w, the output z is equal to 0 if the 
number of previously applied pulses is even and z is equal to | if the number of pulses is 
odd. Hence the circuit acts as a serial parity generator. 

Let A be the state that indicates that an even number of pulses has been received. 
Using the Moore model, the output z will be equal to 0 when the circuit is in state A. As 
long as w = 0, the circuit should remain in A, which is specified by a transition arc that 
both originates and terminates in state A. Thus A is stable when w = 0. When the next 
pulse arrives, the input w = | should cause the FSM to move to a new state, say, B, which 
produces the output z = 1. When the FSM reaches B, it must remain stable in this state as 
long as w = 1. This is specified by a transition arc that originates and terminates in B. The 
next input change occurs when w goes to 0. In response the FSM must change to a state 
where z = | and which corresponds to the fact that a complete pulse has been observed, 
namely, that w has changed from | to 0. Let this state be C; it must be stable under the input 
condition w = 0. The arrival of the next pulse makes w = 1, and the FSM must change 
to a state, D, that indicates that an even number of pulses has been observed and that the 
last pulse is still present. The state D is stable under w = 1, and it causes the output to be 
z = 0. Finally, when w returns to 0 at the end of the pulse, the FSM returns to state A, which 
indicates an even number of pulses and w equal to 0 at the present time. The resulting state 
diagram is shown in Figure 9.13a. 

Akey point to understand is why it is necessary to have four states rather than just two, 
considering that we are merely trying to distinguish between the even and odd number of 
input pulses. States B and C cannot be combined into a single state even though they both 
indicate that an odd number of pulses has been observed. Suppose we had simply tried 
to use state B alone for this purpose. Then it would have been necessary to add an arc 
with a label 0 that originates and terminates in state B, which is fine. The problem is that 
without state C, there would have to be a transition from state B directly to D if the input 
is w = | to respond to the next change in the input when a new pulse arrives. It would be 
impossible to have B both stable under w = | and have a change to D effected for the same 
input condition. Similarly, we can show that the states A and D cannot be combined into a 
single state. 

Figure 9.13b gives the flow table that corresponds directly to the state diagram. In 
many cases the designer can derive a flow table directly. We are using the state diagram 
mostly because it provides a simpler visual picture of the effect of the transitions in an FSM. 

The next step is to assign values to the states in terms of the state variables. Since there 
are four states in our FSM, there have to be at least two state variables. Let these variables 
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(a) State diagram 


Present Next state Output 
state we 0 we 1 x 
A (A) B 0 
B Cc 1 
Cc (C) D 1 
D A (D) 0 


(b) Flow table 


Figure 9.13 —_Parity-generating asynchronous FSM. 


be y; and yj. As a first attempt at the state assignment, let the states A, B, C, and D be 
encoded as yy; = 00, 01, 10, and 11, respectively. This assignment leads to the excitation 
table in Figure 9.14a. Unfortunately, it has a major flaw. The circuit that implements this 
table is stable in state D = 11 under the input condition w = 1. But consider what happens 
next if the input changes to w = 0. According to the excitation table, the circuit should 
change to state A = 00 and remain stable in this state. The problem is that in going from 
yoy; = 11 to y2y; = 00 both state variables must change their values. This is unlikely to 
occur at exactly the same time. In an asynchronous circuit the values of the next-state 
variables are determined by networks of logic gates with varying propagation delays. Thus 
we should expect that one state variable will change slightly before the other, which could 
put the circuit into a state where it may react to the input in an undesirable way. Suppose that 
y, changes first. Then the circuit goes from y2y; = 11 to yzy; = 10. As soon as it reaches 
this state, C, it will attempt to remain there if w = 0, which is a wrong outcome. On the 
other hand, suppose that y. changes first. Then there will be a change from y2y; = 11 to 
y2y, = 01, which corresponds to state B. Since w = 0, the circuit will now try to change to 
yoy; = 10. This again requires that both y; and yy change; assuming that y, changes first 
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Present Next state 
state w-0!lw-— 1 | Output 
y2y1 ene Zz 


00 Ol 0 
01 10 1 
10 ul 1 


ll 00 (11) 0 


(a) Poor state assignment 


Present Next state 
state w=0|we=1 | Output 
21 YY, Zz 


00 01 0 
Ol ll 1 


ul (11) 10 1 


10 00 (10) 0 


(b) Good state assignment 


Figure 9.14 — State assignment for Figure 9.13b. 


in the transition from y2y; = 01, the circuit will find itself in the state y2y; = 00, which is 
the correct destination state, A. This discussion indicates that the required transition from 
DtoA will be performed correctly if y. changes before y,, but it will not work if y; changes 
before y2. The result depends on the outcome of the “race” to change between the signals 
y, and yo. 

The uncertainty caused by multiple changes in the state variables in response to an 
input that should lead to a predictable change from one stable state to another has to be 
eliminated. The term race condition is used to refer to such unpredictable behavior. We 
will discuss this issue in detail in Section 9.5. 

Race conditions can be eliminated by treating the present-state variables as if they were 
inputs to the circuit, meaning that only one state variable is allowed to change at a time. For 
our example the assignment A = 00, B = 01, C = 11, and D = 10 achieves this objective. 
The resulting excitation table is presented in Figure 9.14b. The reader should verify that 
all transitions involve changing a single state variable. 
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From Figure 9.145 the next-state and output expressions are 


Yy = wy, +wyi + yiy2 
Y = wy2 + Wy1 + yiya 
Z=y 
The last product term in the expressions for Y; and Y> is included to deal with possible 
hazards, which are discussed in Section 9.6. The corresponding circuit is shown in Fig- 
ure 9.15. 
It is interesting to consider how the serial parity generator could be implemented using 
a synchronous approach. All that is needed is a single flip-flop that changes its state with 
the arrival of each input pulse. The positive-edge-triggered D flip-flop in Figure 9.16 


Figure 9.15 — Circuit that implements the FSM in Figure 9.13b. 


tt 
ek ae 


Figure 9.16 = Synchronous solution for Example 9.4. 
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accomplishes the task, assuming that the flip-flop is initially set to Q=0. The logic 
complexity of the flip-flop is exactly the same as the circuit in Figure 9.15. Indeed, if we 
use the preceding expressions for Y; and Y> and substitute C for w, D for y, y», for y;, and 
ys for y2, we end up with the excitation expressions shown for the master-slave D flip-flop in 
Example 9.2. The circuit in Figure 9.15 is actually a negative-edge-triggered master-slave 
flip-flop, with the complement of its Q output (v2) connected to its D input. The output z is 
connected to the output of the master stage of the flip-flop. 
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MODULO-4 COUNTER — Chapters 5 and 6 described how counters can be implemented 
using flip-flops. Now we will synthesize a counter as an asynchronous sequential circuit. 
Figure 9.17 depicts a state diagram for a modulo-4 up-counter, which counts the number 
of pulses on an input line, w. The circuit must be able to react to all changes in the input 
signal; thus it must take specific actions at both the positive and negative edges of each 
pulse. Therefore, eight states are needed to deal with the edges in four consecutive pulses. 

The counter begins in state A and stays in this state as long as w = 0. When w changes 
to 1, a transition to state B is made and the circuit remains stable in this state as long as 
w = 1. When w goes back to 0, the circuit moves to state C and remains stable until w 
becomes | again, which causes a transition to state D, and so on. Using the Moore model, 
the states correspond to specific counts. There are two states for each particular count: the 
state that the FSM enters when w changes from 0 to | at the start of a pulse and the state that 
the FSM enters when w goes back to 0 at the end of the pulse. States B and C correspond 
to the count of 1, states D and E to 2, and states F and G to 3. States A and H represent the 
count of 0. 

Figure 9.18 shows the flow and excitation tables for the counter. The state assignment 
is chosen such that all transitions between states require changing the value of only one 


b-8.8-4 
cE 


Figure 9.17 — State diagram for a modulo-4 counter. 
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Present Next state Output 


state w=0 


A 


B 


>@oe®O*2* Qe 
@®@=@Q7=Orv@efi 


(a) Flow table 


Present Next state 
state w=O0!lwel Output Mod-8 output 
Y3 251 Y3¥oY £241 232221 


000 001 00 000 
001 O11 01 001 
O11 010 01 010 
010 110 10 O11 
110 111 10 100 
111 101 (an) 11 101 
101 100 11 110 
100 000 00 111 


(b) Excitation table (c) Output for counting 
the edges 


Figure 9.18 — Flow and excitation tables for a modulo-4 counter. 
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state variable to eliminate the possibility of race conditions. The output is encoded as a 
binary number, using variables zz and z;. From the excitation table the next-state and output 
expressions are 


Y, = wy + wy2y3 + WyoV3 + Yiyoy3 + yiVey3 
= wy, + (Ww + y1) 0293 + Yay3) 


Y2 = wy2 + Wy1y3 + Yiy2 + yay3 


Y3 = wy3 + yiy3 + YyyoW + yoy3 


1 = 1 
22 =yiy3 + Yiy2 


These expressions define the circuit that implements the required modulo-4 pulse counter. 

In the preceding derivation we designed a circuit that changes its state on every edge 
of the input signal w, requiring a total of eight states. Since the circuit is supposed to count 
the number of complete pulses, which contain a rising and a falling edge, the output count 
Z2z, changes its value only in every second state. This FSM behaves like a synchronous 
sequential circuit in which the output count changes only as a result of w changing from 0 
to 1. 

Suppose now that we want to count the number of times the signal w changes its value, 
that is, the number of its edges. The state transitions specified in Figures 9.17 and 9.18 
define an FSM that can operate as a modulo-8 counter for this purpose. We only need to 
specify a distinct output in each state, which can be done as shown in Figure 9.18c. The 
values of z3Z2z, indicate the counting sequence 0, 1,2,..., 7,0. Using this specification 
of the output and the state assignment in Figure 9.18), the resulting output expressions are 


Z1=y1 By2 PB y3 
Z2 = y2 B y3 


23 = Y3 
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A SIMPLE ARBITER —_In computer systems it is often useful to have some resource shared 
by a number of different devices. Usually, the resource can be used by only one device at a 
time. When various devices need to use the resource, they have to request to do so. These 
requests are handled by an arbiter circuit. When there are two or more outstanding requests, 
the arbiter may use some priority scheme to choose one of them, as already discussed in 
Section 6.8. 

We will now consider an example of a simple arbiter implemented as an asynchronous 
sequential circuit. To keep the example small, suppose that two devices are competing 
for the shared resource, as indicated in Figure 9.19a. Each device communicates with the 
arbiter by means of two signals—Request and Grant. When a device needs to use the shared 
resource, it raises its Request signal to 1. Then it waits until the arbiter responds with the 
Grant signal. 

Figure 9.19b illustrates a commonly used scheme for communication between two 
entities in the asynchronous environment, known as handshake signaling. Two signals are 
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Request1 


Grantl Device | 


Shared 


Arbiter resource 


Request2 


Grant2 


(a) Arbitration structure 


Request (r) 


Grant (g) 


(b) Handshake signaling 


Figure 9.19 — Arbitration example. 


used to provide the handshake. A device initiates the activity by raising a request, r = 1. 
When the shared resource is available, the arbiter responds by issuing a grant, g = 1. When 
the device receives the grant signal, it proceeds to use the requested shared resource. When 
it completes its use of the resource, it drops its request by setting r = 0. When the arbiter 
sees that r = 0, it deactivates the grant signal, making g = 0. The arrows in the figure 
indicate the cause-effect relationships in this signaling scheme; a change in one signal 
causes a change in the other signal. The time elapsed between the changes in the cause- 
effect signals depends on the specific implementation of the circuit. A key point is that 
there is no need for a synchronizing clock. 

A State diagram for our simple arbiter is given in Figure 9.20. There are two inputs, the 
request signals r; and r2, and two outputs, the grant signals g; and g2. The diagram depicts 
the Moore model of the required FSM, where the arcs are labeled as rr; and the state 
outputs as gog1. The quiescent state is A, where there are no requests. State B represents the 
situation in which Device | is given permission to use the resource, and state C denotes the 
same for Device 2. Thus B is stable if r2r; = 01, and C is stable if rxr; = 10. To conform 
to the rules of asynchronous circuit design, we will assume that the inputs 7, and ry become 
activated one at a time. Hence, in state A it is impossible to have a change from ryr,; = 00 
to rar; = 11. The situation where r2r; = 11 occurs only when a second request is raised 
before the device that has the grant signal completes its use of the shared resource, which 
can happen in states B and C. If the FSM is stable in either state B or C, it will remain in 
this state if both r; and rz go to 1. 

The flow table is given in Figure 9.21a, and the excitation table is presented in Fig- 
ure 9.21b. It is impossible to choose a state assignment such that all changes between states 
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Poly 


Figure 9.20 — State diagram for the arbiter. 


Present Next state Output 


state | rr7= 00 01 10 11] 8281 


A (A) B Cc = 00 
B A (@) c @| 01 
Cc A B(©Q©| 10 


(a) Flow table 


Present Next state 
state ror) = 00 OL 10° 11 Output 
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00 ol 10 - 00 

01 00 10 01 
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(b) Excitation table 


Figure 9.21 Implementation of the arbiter. 
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A, B, and C involve a change in a single state variable only. In the chosen assignment 
the transitions to or from state A are handled properly, but the transitions between states 
B and C involve changes in the values of both state variables y; and yz. Suppose that the 
circuit is stable in state B under input valuation r2r; = 11. Now let the inputs change to 
rar, = 10. This should cause a change to state C, which means that the state variables must 
change from yy; = 01 to 10. If y, changes faster than y», then the circuit will find itself 
momentarily in state yy; = 00, which leads to the desired final state because from state A 
there is a specified transition to C under the input valuation 10. But if y2 changes faster 
than y,, the circuit will reach the state yy; = 11, which is not defined in the flow table. To 
make sure that even in this case the circuit will proceed to the required destination C, we 
can include the state yoy; = 11, labeled D, in the excitation table and specify the required 
transition as shown in the figure. A similar situation arises when the circuit is stable in C 
under ror; = 11, and it has to change to B when rz changes from | to 0. 

The output values for the extra state D are indicated as don’t cares. Whenever a specific 
output is changing from 0 to | or from | to 0, exactly when this change takes place is not 
important if the correct value is produced when the circuit is in a stable state. The don’t-care 
specification may lead to a simpler realization of the output functions. It is important to 
ensure that unspecified outputs will not result in a value that may cause erroneous behavior. 
From Figure 9.21 it is possible that during the short time when the circuit passes through 
the unstable state D the outputs become g2g; = 11. This is harmless in our example because 
the device that has just finished using the shared resource will not try to use it again until its 
grant signal has returned to 0 to indicate the end of the handshake with the arbiter. Observe 
that if this condition occurs when changing from B to C, then g; remains | slightly longer 
and g2 becomes | slightly earlier. Similarly, if the transition is from C to B, then the change 
in g; from 0 to | happens slightly earlier and gz changes to 0 slightly later. In both of these 
cases there is no glitch on either g; or g2. 

From the excitation table the following next-state and output expressions are derived 


Y, =fari + ry 
Y2 = roF| + roy2 
§l=Yl 
82 = y2 


Rewriting the first two expressions as 


Y, =r (72 + y2) 
= riroyr 
Yy = ro(r, + y2) 


produces the circuit in Figure 9.22. Observe that this circuit responds very quickly to the 
changes in the input signals. This behavior is in sharp contrast to the arbiter discussed in 
Section 6.8 in which the synchronizing clock determines the minimum response time. 
The difficulty with the race condition that arises in state changes between B and C can 
be resolved in another way. We can simply prevent the circuit from reaching an unspecified 
state. Figure 9.23a shows a modified flow table in which transitions between states B and 


ry 


7) 


Figure 9.22 
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The arbiter circuit. 


Present Next state Output 
state | rr, = 00 01 10 11 | 4! 
A (A) B c -| 00 
B A (B) A ol 
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(a) Modified flow table 

Present Next state 

sides! |e SOO: “OL. to an | Oe 
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00 01 10 — | 00 
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10 00 00 10 


Figure 9.23 


(b) Modified excitation table 


Figure 9.21a. 


An alternative for avoiding a critical race in 
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C are made via state A. If the circuit is stable in B and the input valuation changes from 
rar; = 11 to 10, achange to A will occur first. As soon as the circuit reaches A, which is not 
stable for the input valuation 10, it will proceed to the stable state C. The detour through 
the unstable state A is acceptable because in this state the output is g2g; = 00, which is 
consistent with the desired operation of the arbiter. The change from C to B is handled 
using the same approach. From the modified excitation table in Figure 9.23), the following 
next-state expressions are derived 


Y; = TiVo 
Y2 =P iroy, + ry2 


These expressions give rise to a circuit different from the one in Figure 9.22. However, 
both circuits implement the functionality required in the arbiter. 

Next we will attempt to design the same arbiter using the Mealy model specification. 
From Figure 9.20 it is apparent that the states B and C are fundamentally different because 
for the input r27; = 11 they must produce two different outputs. But state A is unique only 
to the extent that it generates the output g2g; = 00 whenever r2r; = 00. This condition 
could be specified in both B and C if the Mealy model is used. Figure 9.24 gives a suitable 
state diagram. The flow and excitation tables are presented in Figure 9.25, which lead to 
the following expressions 


Y=nr,+riytny 
gi=ny 
82 =12y 


Despite needing a single state variable, this circuit requires more gates for implementation 
than does the Moore version in Figure 9.22. 

An important notion in the above examples is that it is necessary to pay careful attention 
to the state assignment, to avoid races in changing of the values of the state variables. Sec- 
tion 9.5 deals with this issue in more detail. 

We made the basic assumption that the request inputs to the arbiter FSM change their 
values one at a time, which allows the circuit to reach a stable state before the next change 
takes place. If the devices are totally independent, they can raise their requests at any time. 
Suppose that each device raises a request every few seconds. Since the arbiter circuit needs 
only a few nanoseconds to change from one stable state to another, it is quite unlikely that 
both devices will raise their requests so close to each other that the arbiter circuit will produce 
erroneous outputs. However, while the probability of an error caused by the simultaneous 
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Figure 9.24 = Mealy model for the arbiter FSM. 
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(b) Excitation table 


Figure 9.25 = Mealy model implementation of the arbiter FSM. 


arrival of requests is extremely low, it is not zero. If this small possibility of an error cannot 
be tolerated, then it is possible to feed the request signals through a special circuit called 
the mutual exclusion (ME) element. This circuit has two inputs and two outputs. If both 
inputs are 0, then both outputs are 0. If only one input is 1, then the corresponding output 
is 1. If both inputs are 1, the circuit makes one output go to | and keeps the other at 0. 
Using the ME element would change the design of the arbiter slightly; because the valuation 
rar, = 11 would never occur, all entries in the corresponding column in Figure 9.21 would 
be don’t cares. The ME element and the issue of simultaneous changes in input signals are 
discussed in detail in reference [6]. Finally, we should note that a similar problem arises 
in synchronous circuits in which one or more inputs are generated by a circuit that is not 
controlled by a common clock, as discussed in Section 7.8.3. 
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In Chapter 6 we saw that reducing the number of states needed to realize the functionality 
of a given FSM usually leads to fewer state variables, which means that fewer flip-flops are 
required in the corresponding synchronous sequential circuit. In asynchronous sequential 
circuits it is also useful to try to reduce the number of states because this usually results in 
simpler implementations. 
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When designing an asynchronous FSM, the initial flow table is likely to have many 
unspecified (don’t-care) entries, because the designer has to obey the restriction that only 
one input variable can change its value at a time. For example, suppose that we want to 
design the FSM for the simple vending machine considered in Example 9.3. Recall that 
the machine accepts nickels and dimes and dispenses candy when 10 cents is deposited; 
the machine does not give change if 15 cents is deposited. An initial state diagram for 
this FSM can be derived in straightforward fashion by enumerating all possible sequences 
of depositing the coins to give a sum of at least 10 cents. Figure 9.26a shows a possible 
diagram, defined as a Moore model. Starting in a reset state, A, the FSM remains in this 
state as long as no coin is deposited. This is denoted by an arc labeled 0 to indicate that 
N = D=0. Now let an arc with the label N denote that the coin-sensing mechanism has 
detected a nickel and has generated a signal N = 1. Similarly, let D denote that a dime has 
been deposited. If N = 1, then the FSM has to move to a new state, say, B, and it must 
remain stable in this state as long as N has the value of 1. Since B corresponds to 5 cents 
being deposited, the output in this state has to be 0. If a dime is deposited in state A, then the 
FSM must move to a different state, say, C. The machine should stay in C as long as D = 1, 
and it should release the candy by generating the output of 1. These are the only possible 
transitions from state A, because it is impossible to insert two coins at the same time, which 
means that DN = 11 can be treated as a don’t-care condition. Next, in state B there must 
be a return to the condition DN = 00 because the coin-sensing mechanism will detect the 
second coin some time after the first coin has cleared the mechanism. This behavior is 
consistent with the requirement that only one input variable can change at a time; hence it 
is not allowed to go from DN = 01 to DN = 10. The input DN = 10 cannot occur in state 
B and should be treated as a don’t care. The input DN = 00 takes the FSM to a new state, 
D, which indicates that 5 cents has been deposited and that there is no coin in the sensing 
mechanism. In state D it is possible to deposit either a nickel or a dime. If DN = O1, the 
machine moves to state E, which denotes that 10 cents has been deposited and generates 
the output of 1. If DN = 10, the machine moves to state F’, which also generates the output 
of 1. Finally, when the FSM is in any of the states C, E, or F, the only possible input is 
DN = 00, which returns the machine to state A. 

The flow table for this FSM is given in Figure 9.26b. It shows explicitly all don’t-care 
entries. Such unspecified entries provide a certain amount of flexibility that can be exploited 
in reducing the number of states. Note that in each row of this table there is only one stable 
state. Such tables, where there is only one stable state for each row, are often referred to as 
primitive flow tables. 

Several techniques have been developed for state reduction. In this section we will 
describe a two-step process. In the first step we will apply the partitioning procedure from 
Section 6.6.1, assuming that the potentially equivalent rows in a flow table must produce 
the same outputs. As an additional constraint, for two rows to be potentially equivalent any 
unspecified entries must be in the same next-state columns. Thus combining the equivalent 
states into a single state will not remove the don’t cares and the flexibility that they provide. 
In the second step, the rows are merged exploiting the unspecified entries. Two rows can be 
merged if they have no conflicting next-state entries. This means that their next-state entries 
for any given valuation of inputs are either the same, or one of them is unspecified, or both 
rows indicate a stable state. If the Moore model is used, then the two rows (states) must 
produce the same outputs. If the Mealy model is used, then the two states must produce the 
same outputs for any input valuations for which both states are stable. 
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(a) Initial state diagram 
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(b) Initial flow table 


Figure 9.26 _ Derivation of an FSM for the simple vending machine. 


We will now show how the flow diagram in Figure 9.26b can be reduced to the optimized Example 9.7 
form in Figure 9.12. The first step in the state-reduction process is the partitioning procedure 
from Section 6.6.1. States A and D are stable under the input valuation DN = 00, producing 
the output of 0; they also have the unspecified entries in the same position. States C and F 
are stable under DN = 10, generating z = 1, and they have the same unspecified entries. 
States B and E have the same unspecified entries, but when they are stable under DN = O1 
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the state B produces z = 0 while E generates z = 1; they are not equivalent. Therefore, the 
initial partition is 


P| = (AD)(B)(CF)(E) 


The successors of A and D are (A, D) for DN = 00, (B, E) for 01, and (C, F) for 10. Since 
the (B, E) pair is not in the same block of P;, it follows that A and D are not equivalent. 
The successors of C and F are (A, A) for 00 and (C, F) for 10; each pair is in a single block. 
Thus the second partition is 


Py = (A)(D)(B)(CF)(E) 
The successors of C and F in P2 are in the same block of P2, which means that 
P3 = P» 


The conclusion is that rows C and F are equivalent. Combining them into a single row and 
changing all F's into Cs gives the flow table in Figure 9.27. 

Next we can try to merge some rows in the flow table by exploiting the existence of 
unspecified entries. The only row that can be merged with others is C. It can be merged 
with either A or E, but not both. Merging C with A would mean that the new state has to 
generate z = 0 when it is stable under the input valuation 00 and has to produce z = 1 when 
stable under 10. This can be achieved only by using the Mealy model. The alternative is to 
merge C and E, in which case the new state is stable under DN = 01 and 10, producing the 
output of 1. This can be achieved with the Moore model. Merging C and EF into a single 
state C and changing all F's into Cs yields the reduced flow table in Figure 9.12. Observe 
that when C and E are merged, the new row C must include all specifications in both rows 
C and E. Both rows specify A as the next state if DV = 00. Row E specifies a stable state 
for DN = 01; hence the new row (called C) must also specify a stable state for the same 
valuation. Similarly, row C specifies a stable state for DN = 10, which must be reflected 
in the new row. Therefore, the next-state entries in the new row are A, , and for the 
input valuations 00, 01, and 10, respectively. 


Present Next state Output 
state = =pDN=00 O1 10 II z 
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Figure 9.27 _First-step reduction of the FSM in Figure 9.26b. 
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Merging Procedure 

In Example 9.7 it was easy to decide which rows should be merged because the only 
possibilities are to merge row C with either A or E. We chose to merge C and E because 
this can be done preserving the Moore model, which is likely to lead to a simpler expression 
that realizes the output z. 

In general, there can be many possibilities for merging rows in larger flow tables. In 
such cases it is necessary to have a more structured procedure for making the choice. A 
useful procedure can be defined using the concept of compatibility of states. 


Definition 9.1 — Two states (rows in a flow table), S; and S;, are said to be compatible if 
there are no state conflicts for any input valuation. Thus for each input valuation, one of 
the following conditions must be true: 


¢ both S; and S; have the same successor, or 
¢ both S; and S; are stable, or 


¢ — the successor of S; or S;, or both, is unspecified. 
Moreover, both S; and S; must have the same output whenever specified. 


Consider the primitive flow table in Figure 9.28. Let us examine the compatibility be- 
tween different states, assuming that we would like to retain the Moore-type specification 
of outputs for this FSM. State A is compatible only with state H. State B is compatible 
with states F and G. State C is not compatible with any other state. State D is compatible 
with state E; so are state F with G and state G with H. In other words, the following com- 
patible pairs exist: (A, H), (B, F), (B, G), (D, E), (F, G), and (G, H). The compatibility 
relationship among various states can be represented conveniently in the form of a merger 
diagram, as follows: 


Present Next state Output 
wow,;= 00 O1 10 I1 z 

A (A) H B - 
F - Cc] 0 
7 (C) 1 


H 
A (@)- E 1 
D 
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state 


G @®] 1 


7a 7m UA DB 
| 


| 
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Figure 9.28 —_A primitive flow table. 
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e Each row of the flow table is represented as a point, labeled by the name of the row. 


e Aline is drawn connecting any two points that correspond to compatible states (rows). 


From the merger diagram the best merging possibility can be chosen, and the reduced flow 
table can be derived. 

Figure 9.29 gives the merger diagram for the primitive flow table in Figure 9.28. The 
diagram indicates that row A can be merged with H, but only if H is not merged with G, 
because there is no line joining A and G. Row B can be merged with rows F and G. Since 
it is also possible to merge F and G, it follows that B, F, and G are pairwise compatible. 
Any set of rows that are pairwise compatible for all pairs in the set can be merged into a 
single state. Thus states B, F, and G can be merged into a single state, but only if states G 
and H are not merged. State C cannot be merged with any other state. States D and E can 
be merged. 

A prudent strategy is to merge the states so that the resulting flow table has as few states 
as possible. In our example the best choice is to merge the compatibles (A, H), (B, F, G), 
and (D, E), which leads to the reduced flow table in Figure 9.30. When a new row is 


H G F 


Figure 9.29 —= Merger diagram for the flow table in Figure 9.28, which 
preserves the Moore model. 
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Figure 9.30 — Reduced Moore-type flow table for the FSM in 
Figure 9.28. 
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created by merging two or more rows, all entries in the new row have to be specified to 
cover the individual requirements of the constituent rows. Replacing rows A and H witha 
new row A requires making A stable for both w2w; = 00 and 01, because the old A has to 
be stable for 00 and # has to be stable for 01. It also requires specifying B as the next-state 
for w2w,; = 10 and E as the next state for wow; = 11. Since the old state E becomes D, 
after merging D and E, the new row A must have the next-state entries : , B, and 
D for the input valuations 00, 01, 10, and 11, respectively. Replacing rows B, F, and G 
with a new row B requires making B stable for w2w,; = 00 and 10. The next-state entry 
for w2w; = O1 has to be D to satisfy the requirement of the old state F. The next-state 
entry for w2w; = 11 has to be C, as dictated by the old state B. Observe that the old state G 
imposes no requirements for transitions under w2w; = 01 and 11, because its corresponding 
next-state entries are unspecified. Row C remains the same as before except that the name 
of the next-state entry for w2w; = 01 has to be changed from H to A. Rows D and E are 
replaced by a new row D, using similar reasoning. Note that the flow table in Figure 9.30 
is still of Moore type. 

So far we considered merging only those rows that would allow us to retain the Moore- 
type specification of the FSM in Figure 9.28. If we are willing to change to the Mealy 
model, then other possibilities exist for merging. Figure 9.31 shows the complete merger 
diagram for the FSM of Figure 9.28. Black lines connect the compatible states that can 
be merged into a new state that has a Moore-type output; this corresponds to the merger 
diagram in Figure 9.29. Blue lines connect the states that can be merged only if Mealy-type 
outputs are used. 

In this case going to the Mealy model is unlikely to result in a simpler circuit. Although 
several merger possibilities exist, they all require at least four states in the reduced flow 
table, which is not any better than the solution obtained in Figure 9.30. For example, one 
possibility is to perform the merge based on the partition (A, H), (B,C, G) (D, E) (F). 


G F 


Figure 9.31 = Complete merger diagram for Figure 9.28. 
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Another possibility is to use (A, C) (B, F) (D, E) (G, H). We will not pursue these possi- 
bilities and will discuss the issues involved in specifying the Mealy-type outputs in Exam- 
ple 9.9. 

State Reduction Procedure 

We can summarize the steps needed to generate the reduced flow table from a primitive 
flow table as follows: 
1. Use the partitioning procedure to eliminate the equivalent states in a primitive flow 

table. 

Construct a merger diagram for the resulting flow table. 


Choose subsets of compatible states that can be merged, trying to minimize the 
number of subsets needed to cover all states. Each state must be included in only one 
of the chosen subsets. 

Derive the reduced flow table by merging the rows in chosen subsets. 


5. Repeat steps 2 to 4 to see whether further reductions are possible. 
Choosing an optimal subset of compatible states for merging can be a very complicated 


task because for large FSMs there may be many possibilities that should be investigated. A 
trial-and-error approach is a reasonable way to tackle this problem. 


Example 9.8 


Consider the initial flow table in Figure 9.32. To apply the partitioning procedure, we 
identify state pairs (A, G), (B, L), and (H, K) as being potentially equivalent rows, because 
both rows in each pair have the same outputs and their don’t-care entries are in the same 
column. The remaining rows are distinct in this respect. Therefore, the first partition is 


P, = (AG)(BL)(C)(D)(E)(F) (AK) (J) 


Now the successors of (A, G) are (A, G) for wow, = 00, (F, B) for 01, and (C, J) for 10. 
Since F and B, as well as C and J, are not in the same block, it follows that A and G are 
not equivalent. The successors of (B, L) are (A, A), (B, L), and (H, K), respectively. All 
are in single blocks. The successors of (H, K) are (L, B), (E, E), and (H, K), which are all 
contained in single blocks. Therefore, the second partition is 


Pz = (A)(G)(BL)(C)(D)(E)(F)(AK) J) 


Repeating the successor test shows that the successors of (B, L) and (H, K) are still in single 
blocks; hence 


P; = Py 


Combining rows B and L under the name B and rows H and K under the name # leads to 
the flow table in Figure 9.33. 

A merger diagram for this flow table is given in Figure 9.34. It indicates that rows B 
and H should be merged into one row, which we will label as B. The merger diagram also 
suggests that rows D and E should be merged; we will call the new row D. The remaining 
rows present more than one choice for merging. Rows A and F can be merged, but in that 
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Present Next state Output 
state | wow; = 00 O1 10 11 z 
A (A) F c - 0 
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Cc c=. (Db) «0 
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L AG) = K 1 
Figure 9.32 — Flow table for Example 9.8. 
Present Next state Output 
state | wow,=00 O1 10 II z 
A (A) F c - 0 
A - 4H 1 
fe @ = {¢)'D|, 0 
D - F - @)| 1 
E G - @® oD 1 
F -@®- 8 0 
G (GB. a = 0 
H - B E (@| 1 
J G - G) -| 0 
Figure 9.33 — Reduction obtained by using the partitioning 


procedure. 
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i e———————————_-e (=) 


H F J G 


Figure 9.34 = Merger diagram for Figure 9.33. 
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Figure 9.35 — Reduction obtained from the merger diagram in 


Figure 9.34. 
A B D Cc G 
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Figure 9.36 = Merger diagram for Figure 9.35. 


case F and J cannot be merged. Rows C and J can be merged, or G and J can be merged. 
We will choose to merge the rows A and F into a new row called A and rows G and J into 
a new row G. The merger choice is indicated in blue in the diagram. The resultant flow 
table is shown in Figure 9.35. To see whether this table offers any further opportunities 
for merging, we can construct the merger diagram in Figure 9.36. From this diagram it is 
apparent that rows C and G can be merged; let the new row be called C. This leads to the 


flow table in Figure 9.37, which cannot be reduced any more. 


9.4 STATE REDUCTION 


Present Next state Output 


wow, =00 O01 10 11] & 

(a)@ c Bl] o 
B A (@) D @| 1 
Cc Cr Be OD), a 
D c A MO) 1 


Figure 9.37 —_ Reduced flow table for Example 9.8. 
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B D| o 
C F(c)- 4H 1 
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Figure 9.38 — Flow table for Example 9.9. 
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Consider the flow table in Figure 9.38. Applying the partitioning procedure to this table 
gives 


P; = (AFK)(BJ)(CG)(D)(E) (A) 
Pz = (A)(FK)(BJ)(C)(G)(D)(E) (A) 
P3 = Po 


Combining B and J into a new state B, and F and K into F, gives the flow table in Fig- 
ure 9.39. 

Figure 9.40a gives a merger diagram for this flow table, indicating the possibilities for 
merger if the Moore model of the FSM is to be preserved. In this case B and F can be 
merged, as well as C and H, resulting in a six-row flow table. 
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Present Next state 


state wow,;= 00 O1 10 11 “ 
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Figure 9.39 — Reduction resulting from the partitioning procedure. 
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(a) Preserving the Moore model 


(b) Complete merger diagram 


Figure 9.40 = Merger diagrams for Figure 9.39. 
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Next we should consider the merging possibilities if we are willing to change to the 
Mealy model. When going from the Moore model to the Mealy model, a stable state in 
the Mealy model must generate the same output as it had in the Moore model. It is also 
important to ensure that transitions in the Mealy model will not produce undesirable glitches 
in the output signal. 

Figure 9.41 indicates how the FSM of Figure 9.39 can be represented in the Mealy 
form. The next-state entries are unchanged. In Figure 9.41, for each stable state the output 
value must be the same as for the corresponding row of the Moore-type table. For example, 
z = 0 when the state A is stable under w2w; = 00. Also, z = 0 when the states B, D, and F 
are stable under w2w; = 10, 11, and 00, respectively. Similarly, z = 1 when C, E, G, and 
FH are stable under wow; = 01, 10, 01, and 11, respectively. If a transition from one stable 
state to another requires the output to change from 0 to 1, or from | to 0, then the exact 
time when the change takes place is not important, as we explained in Section 9.1 when 
discussing Figure 9.3. For instance, suppose that the FSM is stable in A under w2w,; = 00, 
producing z = 0. If the inputs then change to w.w; = O1, a transition to state G must be 
made, where z = 1. Since it is not essential that z becomes | before the circuit reaches the 
state G, the output entry in row A that corresponds to this transition can be treated as a don’t 
care; therefore, it is left unspecified in the table. From the stable state A, it is also possible 
to change to E, which allows specifying another don’t care because z changes from 0 to 1. 
A different situation arises in row B. Suppose that the circuit is stable in B under w2w; = 10 
and that the inputs change to 11. This has to cause a change to stable state D, and z must 
remain at 0 throughout the change in states. Hence the output in row B under ww, = 11 is 
specified as 0. If it were left unspecified, to be used as a don’t care, then it is possible that 
in the implementation of the circuit this don’t care may be treated as a 1. This would cause 
a glitch in z, which would change 0 > 1 — 0 as the circuit moves from B to D when the 
inputs change from 10 to 11. The same situation occurs for the transition from B to F when 
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Figure 9.41 The FSM of Figure 9.39 specified in the form of the 
Mealy model. 
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the inputs change from 10 to 00. We can use the same reasoning to determine other output 


entries in Figure 9.41. 


From Figure 9.41 we can derive the merger diagram in Figure 9.40b. The blue lines 
connect the rows that can be merged only by specifying the output in the Mealy style. The 
black lines connect the rows that can be merged even if the outputs are of Moore type; 
they correspond to the diagram in Figure 9.40a. Choosing the subsets of compatible states 
(A, H), (B,G), (C, F), and (D, E), the FSM can be represented using only four states. 
Merging the states A and Hf into a new state A, states B and G into B, states C and F into 
C, and D and E into D, we obtain the reduced flow table in Figure 9.42. Each entry in this 
table meets the requirements specified in the corresponding rows that were merged. 


Example 9.10 As another example consider the flow table in Figure 9.43. The partitioning procedure gives 


P = (AF)(BEG)(C)(D)(A) 
P, = (AF)(BE)(G)(C)(D)(#) 


P3 = Pp 
Present Next state Output z 
state | wow,=00 O01 10 11/00 O1 10 11 
(4) B D(@ jo - 1 1 
z C D0 1 0 0 
_ ©@Q B Ajo 1 01 
" Ac @M@/- - 1 0 


Figure 9.42 


Reduced flow table for Example 9.9. 


Present Next state Output 
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Figure 9.43 


Flow table for Example 9.10. 


Replacing state F with A, and state E with B, results in the flow table in Figure 9.44. The 
corresponding merger diagram is presented in Figure 9.45. It is apparent that states A, B, 
and C can be merged and replaced with a new state A. Also D, G, and H can be merged 
into a new state D. The result is the reduced flow table in Figure 9.46, which has only two 
rows. Again we have used the Mealy model because the merged stable states D and H have 


z = | while G has z = 0. 
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Present Next state Output 
state | wow, = 00 O1 10 It z 
A (A) B Cc - 0 
B A @) = 0 
Cc A - © H| 0 
D (D) G c - 1 
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Figure 9.44 


Figure 9.45 


A G 


Merger diagram for Figure 9.44. 


Reduction after the partitioning procedure. 
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Figure 9.46 


Reduced flow diagram for Example 9.10. 
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9.5 STATE ASSIGNMENT 


The examples in Section 9.3 illustrate that the state assignment task for asynchronous 
FSMs is complex. The time needed to change the value of a state variable depends on 
the propagation delays in the circuit. Thus it is impossible to ensure that a change in the 
values of two or more variables will take place at exactly the same time. To achieve reliable 
operation of the circuit, the state variables should change their values one at a time in 
controlled fashion. This is accomplished by designing the circuit such that a change from 
one state to another entails a change in one state variable only. 

States in FSMs are encoded as bit strings that represent different valuations of the state 
variables. The number of bit positions in which two given bit strings differ is called the 
Hamming distance between the strings. For example, for bit strings 0110 and 0100 the 
Hamming distance is 1, while for 0110 and 1101 it is 3. Using this terminology, an ideal 
state assignment has a Hamming distance of | for all transitions from one stable state to 
another. When the ideal state assignment is not possible, an alternative that makes use of 
unspecified states and/or transitions through unstable states must be sought. Sometimes it 
is necessary to increase the number of state variables to provide the needed flexibility. 


Example 9.11 


Consider the parity-generating FSM in Figure 9.13. Two possible state assignments for this 
FSM are given in Figure 9.14. The transitions between states, as specified in Figure 9.13b, 
can be described in pictorial form as shown in Figure 9.47. Each row of the flow table is 
represented by a point. The four points needed to represent the rows are placed as vertices 
of a square. Each vertex has an associated code that represents a valuation of the state 
variables, y2y,. The codes shown in the figure, with yxy; = 00 in the lower-left corner and 
so on, correspond to the coordinates of the two-dimensional cube presented in Section 8.3.1. 
Figure 9.47a shows what happens if the state assignment in Figure 9.14a is used; namely, 
if A = 00, B = 01, C = 10, and D = 11. There is a transition from A to B if w = 1, which 
requires a change in y, only. A transition from C to D occurs if w = 1, which also requires 
a change in y; only. However, a transition from B to C caused by w = 0 involves a change 
in the values of both y2 and y;. Similarly, both state variables must change in going from 
DtoA if w = 0. Achange in both variables corresponds to a diagonal path in the diagram. 

Figure 9.47b shows the effect of the state assignment in Figure 9.14b, which reverses 
the valuations assigned to C and D. In this case all four transitions are along the edges 
of the two-dimensional cube, and they involve a change in only one of the state variables. 
This is the desirable state assignment. 


Example 9.12 


The flow table for an arbiter FSM is given in Figure 9.21a. Transitions for this FSM are 
shown in Figure 9.48a, using the state assignment A = 00, B = O01, and C = 10. In this case 
multiple transitions are possible between the states. For example, there are two transitions 
between A and B: from B to A if r2r7; = 00 and from A to B if r2r; = 01. Again there is a 
diagonal path, corresponding to transitions between B and C, which should be avoided. A 
possible solution is to introduce a fourth state, D, as indicated in Figure 9.48b. Now the 
transitions between B and C can take place via the unstable state D. Thus instead of going 
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A = 00 B = 01 


(b) Corresponding to Figure 9.146 


Figure 9.47 _ Transitions in Figure 9.13. 


directly from B to C when rr; = 10, the circuit will go first from B to D and then from D 
to C. 

Using the arrangement in Figure 9.48b requires modifying the flow table as shown 
in Figure 9.49. The state D is not stable for any input valuation. It cannot be reached if 
rar; = 00 or 11; hence these entries are left unspecified in the table. Also observe that we 
have specified the output g2g; = 10 for state D, rather than leaving it unspecified. When 
a transition from one stable state to another takes place via an unstable state, the output of 
the unstable state must be the same as the output of one of the two stable states involved 
in the transition to ensure that a wrong output is not generated while passing through the 
unstable state. 

It is interesting to compare this flow table with the excitation table in Figure 9.21b, 
which is also based on using the extra state D. In Figure 9.21b the state D specifies the 
necessary transitions should the circuit accidentally find itself in this state as a result of a 
race in changing the values of both state variables. In Figure 9.49 the state D is used in 
orderly transitions, which are not susceptible to any race conditions. 
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C = 10 


00 | 


— 


A = 00 a B = 01 


(b) Using the extra state D 


Figure 9.48 Transitions for the arbiter FSM in Figure 9.21. 
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Figure 9.49 — Modified flow table based on the transitions in 
Figure 9.48b. 
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9.5.1 TRANSITION DIAGRAM 


A diagram that illustrates the transitions specified in a flow table is called a transition dia- 
gram. In some books such diagrams are called state-adjacency diagrams. These diagrams 
provide a convenient aid in searching for a suitable state assignment. 

A good state assignment results if the transition diagram does not have any diagonal 
paths. A general way of stating this requirement is to say that it must be possible to embed 
the transition diagram onto a k-dimensional cube, because in a cube all transitions between 
adjacent vertices involve the Hamming distance of 1. Ideally, a transition diagram for an 
FSM with n state variables can be embedded onto an n-dimensional cube, as is the case in 
the examples in Figures 9.47b and 9.48b. If this is not possible, then it becomes necessary 
to introduce additional state variables, as we will see in later examples. 

The diagrams in Figures 9.47 and 9.48 present all information pertinent to transitions 
between the states in the given FSMs. For larger FSMs such diagrams take on a cluttered 
appearance. A simpler form can be used instead, as described below. 

A transition diagram has to show the state transitions for each valuation of the input 
variables. The direction of a transition, for example from A to B or from B to A, is not 
important, because it is only necessary to ensure that all transitions involve the Hamming 
distance of 1. The transition diagram has to show the effect of individual transitions into 
each stable state, which may involve passing through unstable states. For a given row of a 
flow table, it is possible to have two or more stable-state entries for different input valuations. 
It is useful to identify the transitions leading into these stable states with distinct labels in a 
transition diagram. To give each stable-state entry a distinct label, we will denote the stable- 
state entries with numbers 1, 2,3, .... Thus if state A is stable for two input valuations, we 
will replace the label A with | for one input valuation and with 2 for the other valuation. 

Figure 9.50 shows a relabeled version of the flow table in Figure 9.21a. We have 
arbitrarily chosen_to label (A) as 1, the two appearances of as 2 and 3, and the two 
appearances of as 4 and 5. All entries in each next-state column are labeled using this 
scheme. The transitions identified by these labels are presented in Figure 9.51a. The same 
information is given in Figure 9.48a. Actually, the diagram in Figure 9.48a contains more 
information because arrowheads show the direction of each transition. Note also that the 
edges in that diagram are labeled with input values rr, whereas the edges in Figure 9.51a 
are labeled with numerical stable-state labels as explained above. 
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Figure 9.50 _Relabeled flow table of Figure 9.21. 
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Cc = 10 
1,4 2,4 
A = 00 1,2 B = 01 


(a) Transitions in Figure 9.50 


C = 10 
1,4,2 2,4, 1 
A = 00 1,2,4 B = 01 


(b) Complete transition diagram 


C = 10 
1,4,2 
A = 00 1,2,4 B = 01 


(c) Selected transition diagram 


Figure 9.51 Transition diagrams for Figure 9.50. 


Figure 9.50 indicates that the stable state 2, which is one instance of the stable state 
B, can be reached either from state A or from state C. There is a corresponding label 2 on 
the paths connecting the vertices in the diagram in Figure 9.51a. The difficulty from the 
state-assignment point of view is that the path from C to B is diagonal. In Example 9.12 
this problem was resolved by introducing a new state D. By examining the flow table in 
Figure 9.50 more closely, we can see that the functional behavior of the required arbiter 
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FSM can be achieved if the transition from C to B takes place via state A. Namely, if the 
circuit is stable in C, then the input 727; = 01 can cause the change to A, from which the 
circuit immediately proceeds to state B. We can indicate the possibility of using this path 
by placing the label 2 on the edge that connects C and A in Figure 9.51a. 

Asimilar situation exists for the transition from B to C, whichis labeled 4. An alternative 
path can be realized by causing the circuit to go from state B to state A if rr; = 10 and 
then immediately proceed to C. This can be indicated by placing the label 4 on the edge 
that connects B and A in Figure 9.51a. 

A possibility of having an alternative path for a transition exists whenever two states 
have the same uncircled label in the relabeled flow diagram. In Figure 9.50 there is a 
third such possibility if r2r; = 00, using the label 1. This possibility is not useful because 
changing from either B or C to A involves a change in only one state variable using the 
state assignment in Figure 9.51a. Hence there would be no benefit in having a transition 
between B and C for this input valuation. 

To depict the possibility of having alternative paths, we will indicate in blue the cor- 
responding transitions on the diagram. Thus a complete transition diagram will show all 
direct transitions to stable states in black and possible indirect transitions through unstable 
states in blue. Figure 9.51b shows the complete transition diagram for the flow table in 
Figure 9.21a. 

The transition diagram in Figure 9.51b cannot be embedded on the two-dimensional 
cube, because some transitions require a diagonal path. The blue label 1 on the path between 
B and C is of no concern, because it represents only an alternative path that does not have 
to be used. But the transitions between B and C labeled 2 and 4 are required. The diagram 
shows an alternative path, through A, having the labels 2 and 4. Therefore, the alternative 
path can be used, and the diagonal connection in the diagram can be eliminated. This leads 
to the transition diagram in Figure 9.51c, which can be embedded on the two-dimensional 
cube. The conclusion is that the state assignment A = 00, B = 01, and C = 10 is good, but 
the flow table must be modified to specify the transitions through alternative paths. The 
modified table is the same as the flow table designed earlier using an ad hoc approach, 
shown in Figure 9.23a. 

As a final comment on this example, note the impact of alternative paths on the out- 
puts produced by the FSM. If mr; = 01, then a change from a stable state C through 
unstable A to stable B generates the outputs g.g; = 10 — 00 — O1, rather than 10 — O01 
as specified in Figure 9.21a. For the arbiter FSM this presents no problem, as explained in 
Example 9.6. 


Procedure for Deriving Transition Diagrams 


The transition diagram is derived from a flow table as follows: 


e Derive the relabeled flow table as explained above. For a given input valuation, all 
transitions that lead to the same stable state are labeled with the same number. Tran- 
sitions through unstable states that eventually lead to a stable state are given the same 
number as the stable-state entry. 


e Represent each row of the flow table by a vertex. 
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e Join two vertices, V; and V;, by an edge if they have the same number in any column 
of the relabeled flow table. 


e For each column in which V; and V; have the same number, label the edge between 
VY; and V; with that number. We will use black labels for direct transitions to circled 
(stable) states and blue labels when the next-state entries for both V; and V; in the flow 
table are uncircled. 


Note that the first point says that in the relabeled flow table the transitions through unstable 
states are given the label of the stable state to which they lead for a given input valuation. 
For example, to derive a transition diagram starting from the flow table in Figure 9.23a, 
the table would be relabeled to give the table in Figure 9.50. The transition from stable A 
to stable B, when r2r; = 01, has the label 2. The same label is given to the transition from 
stable C to unstable A because this transition ultimately leads to stable B. 


9.5.2 EXPLOITING UNSPECIFIED NEXT-STATE ENTRIES 


Unspecified entries in a flow table provide some flexibility in finding good state assignments. 
The following example presents a possible approach. The example also illustrates all steps 
in the derivation of a transition diagram. 


Example 9.13 


Consider the flow table in Figure 9.52a. This FSM has seven stable-state entries. Labeling 
these entries in order, from | to 7, results in the table in part (b) of the figure. In this case 
states | and 2 correspond to state A, 3 and 4 to state B, 5 and 6 to state C, and 7 to state 
D. In the column w2w, = 00 there is a transition from C to A, which is labeled 1, and a 
transition from D to B, which is labeled 3, because 1 and 3 are the successor stable states 
in these transitions. Similarly, in column 11 there are transitions from B to C and from D 
to A, which are labeled 6 and 2, respectively. In column 01 there is a transition from A to 
B, which is labeled 4. State C is stable for this input valuation; it is labeled 5. There is no 
transition specified that leads to this stable state. The state can be reached only if C is stable 
under w2w; = 11, which is labeled 6, and then the inputs change to w2w; = O01. Note that 
the FSM remains stable in C if the inputs change from 11 to 01, or vice versa. Column 
10 illustrates how unstable states are treated. From the stable state A, a transition to the 
unstable state C is specified. As soon as the FSM reaches state C, it proceeds to change to 
the stable state D, which is labeled 7. Thus 7 is used as the label for the entire transition 
sequence from A to C to D. 

Taking rows A, B, C, and D as the four vertices, a first attempt at drawing the transition 
diagram is given in Figure 9.53a. The diagram shows transitions between all pairs of states, 
which seems to suggest that it is impossible to have a state assignment where all transitions 
are characterized by a Hamming distance of 1. If the state assignment A = 00, B = O01, 
C = 11, and D = 10 is used, then the diagonal transition between A and C, or B and D, 
requires both state variables to change their values. The diagonal path from B to D with 
the label 7 is not needed, because an alternative path from B to D exists under label 7 that 
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Present Next state Output 


state | wow,=00 O1 10 1 | 22%! 
A (A) B Cc (A)) 00 
B Dc] ol 
G A (©) D ©] 10 
D B - (D) A 


(a) Flow table 


Present Next state 


state | wow,=00 OL 10 11 | 22%! 


A C1) 4 


B 
Cc 1 (5) 
D 


(b) Relabeled flow table 


Figure 9.52 — Flow tables for Example 9.13. 


passes either through state A or through state C. Unfortunately, the diagonal paths labeled 
1 and 3 cannot be removed, because there are no alternative paths for these transitions. 

As the next attempt at finding a suitable state assignment, we will reverse the codes 
given to B and C, which yields the transition diagram in Figure 9.53b. Now the same 
argument about the alternative paths labeled 7 indicates that the diagonal from C to D can 
be omitted. Also, the label 7 on the diagonal between A and B can be omitted. However, this 
diagonal must remain because of the label 4 for which there is no alternative path between A 
and B. Looking at the flow table in Figure 9.52b, we see an unspecified entry in the column 
w2w, = 01. This entry can be exploited by replacing it with the label 4, in which case the 
transition graph would show the label 4 on the edges connecting A and D, as well as B and 
D. Thus the diagonal between A and B could be removed, producing the transition diagram 
in Figure 9.53c. This diagram can be embedded on a two-dimensional cube, which means 
that the state assignment A = 00, B = 11, C = 01, and D = 10 can be used. 

For the transition diagram in Figure 9.53c to be applicable, the flow table for the FSM 
must be modified as shown in Figure 9.54a. The unspecified entry in Figure 9.52a now 
specifies a transition to state B. According to Figure 9.53c, the change from state A to B 
under input valuation w.w, = 01 must pass through state D; hence the corresponding entry 
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D = 10 7 C= 11 
2,7 6,7 

A = 00 4,7 B = Ol 
(a) First transition diagram 

D = 10 3,7 B= 11 
2,7 6, 7 

A = 00 1,7 C=0l1 


(b) Second transition diagram 


D = 10 3, 7,4 B = ll 
2,7,4 6,7 
A = 00 1,7 C=01 


(c) Augmented transition diagram 


Figure 9.53 Transition diagrams for Figure 9.52. 


in the first row is modified to ensure that this will take place. Also, when w2w, = 10, the 
FSM must go to state D. If it happens to be in state C, then this change has to occur either 
via state A or state B. We have chosen the path via state B in Figure 9.54a. 

The original flow table in Figure 9.52a is defined in the form of the Moore model. 
The modified flow table in Figure 9.54a requires the use of the Mealy model because 
the previously described transitions through unstable states must produce correct outputs. 
Consider first the change from A if w2w; = 01. While stable in state A, the circuit must 
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Present Next state Output zz 


state | wow,=00 O01 10 11] 00 O01 10 


(A) D D (A)| 00 00 11 00 
(B)(B) D clo of nu ol 
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(a) Modified flow table 


Present Next state Output 


state waw;= 00 OL 10 114] 00 Ol 10. «11 


Y2y1 YnY) 224 

00 10 10 00 00 11 00 
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Ol 00 ll -0 10 1- 10 
10 it ii 00}-1 0- 00 


(b) Excitation table 


G9 AQ Bw D> 


Figure 9.54 _ Realization of the FSM in Figure 9.52a. 


produce the output z2z; = 00. Upon reaching the stable state B, the output must become 
01. The problem is that this transition requires a short visit to state D, which in the Moore 
model would produce z2z; = 11. Thus a glitch would be generated on the output signal zo, 
which would undergo the change 0 — | — 0. To avoid this undesirable glitch, the output 
in state D must be z) = 0 for this input valuation, which requires the use of the Mealy model 
as shown in the Figure 9.54a. Observe that while z. must be 0 in D for w2w; = O1, z; can 
be either 0 or | because it is changing from 0 in state A to | in state B. Therefore, z; can be 
left unspecified so that this case can be treated as a don’t-care condition. A similar situation 
arises when the circuit changes from C to D via B if w2w; = 10. The output must change 
from 10 to 11, which means that z2 must remain at | throughout this change, including the 
short time in state B where the Moore model output would be 01. 

The modified flow table and the chosen state assignment lead to the excitation table in 
Figure 9.54b. From this table the next-state and output expressions are derived, as in the 
examples in Section 9.3. 
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9.5.3 STATE ASSIGNMENT USING ADDITIONAL STATE VARIABLES 


In Figure 9.52a there is an unspecified transition that can be exploited to find a suitable 
state assignment, as shown in Section 9.5.2. In general, such flexibility may not exist. It 
may be impossible to find a race-free state assignment using logyn state variables for a flow 
table that has n rows. The problem can be solved by adding extra state variables. This can 
be done in three ways, as illustrated in the examples that follow. 


USING EXTRA UNSTABLE STATES Consider the FSM specified by the flow table in Fig- 
ure 9.55a. The flow table is relabeled in part (b) of the figure. A corresponding transition 
diagram is depicted in Figure 9.56a. It indicates that there are transitions between all pairs 
of vertices (rows). No rearrangement of the existing vertices would allow mapping of the 
transition diagram onto a two-dimensional cube. 

Let us now introduce one more state variable so that we can look for a way to map the 
transition diagram onto a three-dimensional cube. With three state variables the assignment 
for state A can be a Hamming distance of | different from the assignments for B, C, 


Present Next state Output 
state | wow; = 00 O1 10 1] %2%1 
A (A) (A) c B| 00 
B A D Ol 
Cc (Cc) B © D| 10 
D c A MO) u 

(a) Flow table 

Present Next state Output 
state | wow, = 00 O1 10 I] 42%) 
A G@)@) 6 4] 00 
B 12) 7@|] 
@ (3)3 Ge) S|) to 
D S27 ia 


(b) Relabeled flow table 


Figure 9.55 — FSM for Example 9.14. 
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A 1,4 B 
DX} 
D 5,8 C 


(a) Transition diagram 


A 1,4 B 


3 
2 F 
3 
D 5.8 G 5.8 C 


(b) Augmented transition diagram 


(c) Embedded transition diagram 


Figure 9.56 Transition diagrams for Figure 9.55. 


and D. For example, we could have A = 000, B = 001, C = 100, and D = 010. But it 
would then be impossible to have the pairs (B, C), (B, D), and (C, D) within the Hamming 
distance of 1. The solution here is to insert extra vertices in the transition paths, as shown in 
Figure 9.56b. Vertex E separates B from D, while vertices F' and G break the paths (B, C) 
and (C, D). The labels associated with the transitions are attached to both segments of a 
broken path. The resulting transition diagram can be embedded onto a three-dimensional 
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cube as indicated in Figure 9.56c, where the black portion of the cube comprises the desired 
paths. Now the transition from B to D takes place via vertex E if w2w; = 10 (label 7). The 
transition from C to B occurs via F if w2w; = 01 (label 3). The transition from C to D goes 
through G if w2w; = 11 (label 8), and the transition from D to C goes via G if wow; = 00 
(label 5). Therefore, the flow table has to be modified as shown in Figure 9.57a. The three 
extra states are unstable because the circuit will not remain in these states for any valuation 


Present Next state Output 
state | wow, = 00 OL 10 11) 42%! 


A (A) (A) Cc B] 00 
B A (8) E @)| ol 
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(a) Modified flow table 


Present Next state 
state wow, = 00 O01 10 11 | Output 
y3N2Y1 Y3Y>Y, ZZ] 


A 000 100 001 00 


B | 001 000 O11 01 
Cc} 100 101 110] 10 
D| 010 110 000 fi 
E| Ol 2 = Ue -1 
F/ 101 2 OOP = 2 01 
G| 110 100 - - oo] 1- 


(b) Excitation table 


Figure 9.57 = Modified tables for Example 9.14. 


9.5 STATE ASSIGNMENT 


of the inputs. The circuit will merely pass through these states in the process of changing 
from one stable state to another. Observe that each of the states E, F, and G is needed 
to facilitate the transitions caused by only one or two valuations of inputs. Thus it is not 
necessary to specify the actions that might be caused by other input valuations, because 
such situations will never occur in a properly functioning circuit. 

The outputs in Figure 9.57a can be specified using the Mealy model. It is essential 
that a proper output is generated when passing through unstable states, to avoid undesirable 
glitches in the output signals. 

If we assign the state variables as shown on the right of Figure 9.56c, the modified flow 
table leads to the excitation table in Figure 9.57b. From this table, deriving the next-state 
and output expressions is a straightforward task. 
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USING PAIRS OF EQUIVALENT STATES Another approach is to increase the flexibility in 
state assignment by introducing an equivalent new state for each existing state. Thus state 
A can be replaced with two states Al and A2 such that the final circuit produces the same 
outputs for Al and A2 as it would for A. Similarly, other states can be replaced by equivalent 
pairs of states. Figure 9.58 shows how a three-dimensional cube can be used to find a good 
state assignment for a four-row flow table. The four equivalent pairs are arranged so that the 
minimum Hamming distance of 1 exists between all pairs. For example, the pair (B1, B2) 
has the Hamming distance of 1 with respect to Al (or A2), C2, and D2. 

The transition diagram in Figure 9.56a can be embedded onto the three-dimensional 
cube as shown in Figure 9.58. Since there is a choice of two vertices on the cube for each 
vertex in the transition diagram in Figure 9.56a, the embedded transition diagram does not 
involve any diagonal paths. Using this assignment of states, the flow table in Figure 9.55a 
has to be modified as presented in Figure 9.59a. The entries in the table are made to 
allow each transition in the original flow table to be realized using a transition between 
the corresponding pairs of equivalent states. Both states in an equivalent pair are stable 
for the input valuations for which the original state is stable. Thus Al and A2 are stable if 
w2w, = 00 or 01, BI and B2 are stable if w2w; = 01 or 11, and so on. At any given time 


Bl 
1,4 y2 
5,8 D1 
Le Ze, 
2 yi, 
Al A2 


Figure 9.58 — Embedded transition diagram if two nodes per row 
are used. 
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Present Next state Output 


state | wow,=00 01 10 11 | 2221 


Al cl Bl | 00 
A2 Al B2| 00 
BI Al B2 01 
B2 AQ (B2) D2 01 
C1 (cl) C2 (cl) D1 | 10 
C2 (c2) BI (2) p2] 1 
Di Cl A2 i 
D2 @ Di ll 


(a) Modified flow table 


Present Next state 
state w2w, = 00 Ol 10 11 Output 
¥3Y2N1 Y3YoY) Z2Z] 


Al 000 100 010 00 
A2 001 (001) 000. «(O11 00 


BI | o10 000 ol 01 
B2| oll 001 i 01 
c1 | 100 110 (100) 101 | 10 
c2| 110 010 (110) 111) 10 
pi | 101 100 001 ll 
p2| 111 110 101 (1) (an) ll 


(b) Excitation table 


Figure 9.59 — Modified flow and excitation tables for Example 9.15. 
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the FSM may be in either of the two equivalent states that represent an original state. Then 
a change to another state must be possible from either of these states. For example, Figure 
9.55a specifies that the FSM must change from the stable state A to state B if the input is 
w2w, = 11. The equivalent transition in the modified flow table is the change from state 
Al to B1 or from state A2 to B2. If the FSM is stable in A and the input changes from 00 
to 10, then a change to C is required. The equivalent transition in the modified flow table 
is from state Al to C1; if the FSM happens to be in state A2, it will first have to change to 
Al. The remaining entries in Figure 9.59a are derived using the same reasoning. 

The outputs are specified using the Moore model, because the only unstable states are 
those involved in changing from one member of the equivalent pair to another, and both 
members generate the same outputs. For instance, in the previously described transition 
from A to C, if the starting point is A2, it is necessary to go first to Al and then to C1. Even 
though A1 is unstable for w2w; = 10, there is no problem because its output is the same as 
that of A2. Therefore, if the original flow table is defined using the Moore model, then the 
modified flow table can also be done using the Moore model. 

Using the assignment of the state variables in Figure 9.58 gives the excitation table in 
Figure 9.59b. 
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9.5.4 One-Hot STATE ASSIGNMENT 


The previously described schemes based on embedding the flow table in a cube may lead 
to an optimal state assignment, but they require a trial-and-error approach that becomes 
awkward for large machines. A straightforward, but more expensive, alternative is to use 
one-hot codes. If each row in the flow table of an FSM is assigned a one-hot code, then 
race-free state transitions can be achieved by passing through unstable states that are at a 
Hamming distance of | from the two stable states involved in the transition. For example, 
suppose that state A is assigned the code 0001 and state B the code 0010. Then a race-free 
transition from A to B can pass through an unstable state 0011. Similarly, if C is assigned 
the code 0100, then a transition from A to C can be done via the unstable state 0101. 

Using this approach, the flow table in Figure 9.55a can be modified as illustrated in 
Figure 9.60. The four states, A, B, C, and D, are assigned one-hot codes. As seen in the 
figure, it is necessary to introduce six unstable states, E through J, to handle the necessary 
transitions. These unstable states have to be specified only for the specific transitions, 
whereas for other input valuations they may be treated as don’t cares. 

The outputs can be specified using the Moore model. In some cases it does not matter 
when a particular output signal changes its value. For instance, state E is used to facilitate 
the transition from state A to C. Since z2z; = 00 in A and 10 in C, it is not important if z2 
changes when passing through state E. 

While straightforward to implement, the one-hot encoding is expensive because it 
requires n state variables to implement an n-row flow table. Simplicity of design and the 
cost of implementation often provide a challenging trade-off in designing logic circuits! 
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State Present Next state Output 
assignment state wow, =00 O01 10 11 222] 
0001 A (A) (A) E F| 00 
0010 B F G 01 
0100 i On] 1) 1 
1000 D 1 1 ®)@} u 
0101 E Se ae | 
o011 F Ke = || Ge 
1010 G a Se ae & || a4 
0110 H me ms. ell 
1100 I C=. = p) i 
1001 J a B® ee: we | 


Figure 9.60 — State assignment with one-hot encoding. 
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(a) Static hazard 
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(b) Dynamic hazard 


Figure 9.61 _ Definition of hazards. 


| 9.6 HAZARDS 


In asynchronous sequential circuits it is important that undesirable glitches on signals should 
not occur. The designer must be aware of the possible sources of glitches and ensure that 
the transitions in a circuit will be glitch free. The glitches caused by the structure of a given 
circuit and propagation delays in the circuit are referred to as hazards. Two types of hazards 
are illustrated in Figure 9.61. 

A Static hazard exists if a signal is supposed to remain at a particular logic value when 
an input variable changes its value, but instead the signal undergoes a momentary change 
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in its required value. As shown in Figure 9.61a, one type of static hazard is when the signal 
at level | is supposed to remain at | but dips to 0 for a short time. Another type is when the 
signal is supposed to remain at level 0 but rises momentarily to 1, thus producing a glitch. 

A different type of hazard may occur when a signal is supposed to change from | to 0 
or from 0 to 1. If such a change involves a short oscillation before the signal settles into its 
new level, as illustrated in Figure 9.61b, then a dynamic hazard is said to exist. 


9.6.1 Static HAZARDS 


Figure 9.62a shows a circuit with a static hazard. Suppose that the circuit is in the state 
where x; = x2 = x3 = 1, in which case f = |. Now let x; change from | to 0. Then the 
circuit is supposed to maintain f = 1. But consider what happens when the propagation 
delays through the gates are taken into account. The change in x, will probably be observed 
at point p before it will be seen at point g because the path from x, to qg has an extra gate 
(NOT) in it. Thus the signal at p will become 0 before the signal at g becomes equal to 1. 
For a short time both p and q will be 0, causing f to drop to 0 before it recovers back to 1. 
This gives rise to the signal depicted on the left side of Figure 9.61a. 
The glitch on f can be prevented as follows. The circuit implements the function 


Sf = XxX. + X13 


The corresponding Karnaugh map is given in Figure 9.62b. The two product terms realize 
the prime implicants encircled in black. The hazard explained above occurs when there is 
a transition from the prime implicant xx to the prime implicant x;x3. The hazard can be 
eliminated by including the third prime implicant, encircled in blue. (This is the consensus 
term, defined in Property 17a in Section 2.5.) Then the function would be implemented as 


f = XxX. +X 1x3 + 2273 


Now the change in x, from | to 0 would have no effect on the output f because the product 
term x2x3 would be equal to | if x. = x3 = 1, regardless of the value of x,. The resulting 
hazard-free circuit is depicted in Figure 9.62c. 

A potential hazard exists wherever two adjacent Is in a Karnaugh map are not covered 
by a single product term. Therefore, a technique for removing hazards is to find a cover 
in which some product term includes each pair of adjacent Is. Then, since a change in an 
input variable causes a transition between two adjacent Is, no glitch can occur because both 
1s are included in a product term. 

In asynchronous sequential circuits a hazard can cause the circuit to change to an 
incorrect stable state. Example 9.16 illustrates this situation. 
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In Example 9.2 we analyzed the circuit that realizes a master-slave D flip-flop. From the 
excitation table in Figure 9.6a, one could attempt to synthesize a minimum-cost circuit that 
realizes the required functions, Y,, and Y,. This would give 
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X4 Pp 
x) 
f 
X3 q 
(a) Circuit with a hazard 
(b) Karnaugh map 
x 
aa 
X3 f 


(c) Hazard-free circuit 


Figure 9.62 = An example of a static-hazard. 


Ym = CD + Cym 
=(CtD)t Ct ym) 
Y, = Cym + Cys 


= (C t Ym) t (C t Ys) 


The corresponding circuit is presented in Figure 9.63a. At first glance this circuit may seem 
more attractive than the flip-flops discussed in Chapter 5 because it is less expensive. The 
problem is that the circuit contains a static hazard. 
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CD CD 
YmYs oo Ol 11 = 10 YmYs oo O1 11 10 


(b) Karnaugh maps for Y,, and Y, in Figure 9.6a 


(c) Hazard-free circuit 


Figure 9.63 — Two-level implementation of master-slave D flip-flop. 
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Figure 9.635 shows the Karnaugh maps for the functions Y,,, and Y,. The minimum-cost 
implementation is based on the prime implicants encircled in black. To see how this circuit 
is affected by static hazards, assume that presently Y, = 1 and C= D=1. The circuit 
generates Y,, = 1. Now let C change from 1 to 0. For the flip-flop to behave properly, Y; 
must remain equal to 1. In Figure 9.63a, when C changes to 0, both p and r become 1. 
Due to the delay through the NOT gate, g may still be 1, causing the circuit to generate 
Yin = Y, = 0. The feedback from Y,, will maintain g = |. Hence the circuit remains in an 
incorrect stable state with Y, = 0. 

To avoid the hazards, it is necessary to also include the terms encircled in blue, which 
gives rise to the expressions 


Yn = CD + Cy, + DyYm 
Y, _ Cv + Cys + Ynys 


The resulting circuit, implemented with NAND gates, is shown in Figure 9.63c. 
Note that we can obtain another NAND-gate implementation by rewriting the expres- 
sions for Y,, and Y, as 


Yn = CD +(C+D)ym 
=(C t D) t (C+D) t yn) 
= (Ct D)t (Ct D) t ym) 
¥, = Cym+ (C + Ym)¥s 
=(C tym) UC TH) tw) 


These expressions correspond exactly to the circuit in Figure 5.12. 


Example 9.17 


Example 9.18 


From the previous examples, it seems that static hazards can be avoided by including all 
prime implicants in a sum-of-products circuit that realizes a given function. This is indeed 
true. But it is not always necessary to include all prime implicants. It is only necessary 
to include product terms that cover the adjacent pairs of 1s. There is no need to cover the 
don’t-care vertices. 

Consider the function in Figure 9.64. Ahazard-free circuit that implements this function 
should include the encircled terms, which gives 


Sf =X 1x3 + X2x3 + 213X4 


The prime implicant x;x, is not needed to prevent hazards, because it would account only 
for the two Is in the left-most column. These Is are already covered by X1x3. 


Static hazards can also occur in other types of circuits. Figure 9.65a depicts a product- 
of-sums circuit that contains a hazard. If x; = x3 = 0 and x2 changes from 0 to 1, then f 
should remain at 0. However, if the signal at p changes earlier than the signal at g, then p 
and q will both be equal to | for a short time, causing a glitchO > | > Oon/f. 

Ina POS circuit, it is the transitions between adjacent Os that may lead to hazards. Thus 
to design a hazard-free circuit, it is necessary to include sum terms that cover all pairs of 
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Giginin 
a el 


Figure 9.64 — Function for Example 9.17. 


adjacent Os. In this example the term in blue in the Karnaugh map must be included, giving 
f = (1 + 2) 2 + x3) (01 + 3) 


The circuit is shown in Figure 9.65c. 
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9.6.2 DYNAMIC HAZARDS 


A dynamic hazard causes glitches on 0 > 1 or 1 > O transitions of an output signal. An 
example is given in Figure 9.66. Assuming that all NAND gates have equal delays, a 
timing diagram can be constructed as shown. The time elapsed between two vertical lines 
corresponds to a gate delay. The output f exhibits a glitch that should be avoided. 

It is interesting to consider the function implemented by this circuit, which is 


f = XX. + X34 + x1 X4 


This is the minimum-cost sum-of-products expression for the function. If implemented in 
this form, the circuit would not have either a static or a dynamic hazard. 

A dynamic hazard is caused by the structure of the circuit, where there exist multiple 
paths for a given signal change to propagate along. If the output signal changes its value 
three times, 0 — | — 0 — 1| in the example, then there must be at least three paths along 
which a change from a primary input can propagate. A circuit that has a dynamic hazard 
must also have a static hazard in some part of it. As seen in Figure 9.66), there is a static 
hazard involving the signal on wire b. 

Dynamic hazards are encountered in multilevel circuits obtained using factoring or 
decomposition techniques, which were discussed in Chapter 8. Such hazards are neither 
easy to detect nor easy to deal with. The designer can avoid dynamic hazards simply by 
using two-level circuits and ensuring that there are no static hazards. 
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vy Pp 
aw) 
f 
X3 q 
(a) Circuit with a hazard 
(b) Karnaugh map 
x) 
2) 
X3 £ 


(c) Hazard-free circuit 


Figure 9.65 = Static hazard in a POS circuit. 


9.6.3 SIGNIFICANCE OF HAZARDS 


A glitch in an asynchronous sequential circuit can cause the circuit to enter an incorrect 
state and possibly become stable in that state. Therefore, the circuitry that generates the 
next-state variables must be hazard free. It is sufficient to eliminate hazards due to changes 
in the value of a single variable because the basic premise in an asynchronous sequential 


9.6 HAZARDS 


(a) Circuit 


= One gate delay 


xy 


X5 Xz, X4 


(b) Timing diagram 


Figure 9.66 —= Circuit with a dynamic hazard. 


circuit is that the values of both the primary inputs and the state variables must change one 
at a time. 

In combinational circuits, discussed in Chapters 3 and 4, we did not worry about 
hazards, because the output of a circuit depends solely on the values of the inputs. In 
synchronous sequential circuits the input signals must be stable within the setup and hold 
times of flip-flops. It does not matter whether glitches occur outside the setup and hold 
times with respect to the clock signal. 
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9.7 A COMPLETE DESIGN EXAMPLE 


In the previous sections we examined the various design aspects of asynchronous sequential 
circuits. In this section we give acomplete design example, which covers all necessary steps. 


9.7.1 THE VENDING-MACHINE CONTROLLER 


The control mechanism of a vending machine is a good vehicle for illustrating a possible 
application of a digital circuit. We used it in the synchronous environment in Chapter 6. A 
small example of a vending machine served as an object of analysis in Section 9.2. Now 
we will consider a vending-machine controller similar to the one in Example 6.7 to see how 
it can be implemented using an asynchronous sequential circuit. The specification for the 
controller is: 


e It accepts nickels and dimes. 
e A total of 15 cents is needed to release the candy from the machine. 


¢ No change is given if 20 cents is deposited. 


Coins are deposited one at a time. The coin-sensing mechanism generates signals 
N = 1 and D = 1 when it sees a nickel or a dime, respectively. It is impossible to have 
N = D= 1 atthe same time. Following the insertion of a coin for which the sum equals 
or exceeds 15 cents, the machine releases the candy and resets to the initial state. 

Figure 9.67 shows a state diagram for the required FSM. It is derived using a straight- 
forward approach in which all possible sequences of depositing nickels and dimes are 
enumerated in a treelike structure. To keep the diagram uncluttered, the labels D and N 
denote the input conditions DN = 10 and DN = 01, respectively. The condition DN = 00 
is labeled simply as 0. The candy is released in states F, H, and K, which are reached after 
15 cents has been deposited, and in states J and L, upon a deposit of 20 cents. 

The corresponding flow table is given in Figure 9.68. It can be reduced using the 
partitioning procedure as follows 


P, = (ADGJ)(BE)(C) (FIL) (HK) 
Pz = (A)(D)(GJ)(B)(E)(C) (FIL) (HK) 
P3 = Py 


Using G to represent the equivalent states G and J, F to represent F, 7, and L, and H to 
represent H and K yields a partially reduced flow table in Figure 9.69. The merger diagram 
for this table is presented in Figure 9.70. It indicates that states C and EF can be merged, as 
well as F and H. Thus the reduced flow table is obtained as shown in Figure 9.71a. The 
same information is depicted in the form of a state diagram in Figure 9.72. 

Next a suitable state assignment must be found. The flow table is relabeled in Fig- 
ure 9.71 to associate a unique number with each stable state. Then the transition diagram 
in Figure 9.73a is obtained. Since we wish to try to embed the diagram onto a three- 
dimensional cube, eight vertices are shown in the figure. The diagram shows two diagonal 
transitions. The transition between D and G (label 7) does not matter, because it is only an 
alternative path. The transition from A to C (label 4) is required, and it can be realized via 
unused states as indicated in blue in Figure 9.73b. Therefore, the transition diagram can be 
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Figure 9.67 _ Initial state diagram for the vending-machine controller. 


embedded onto a three-dimensional cube as shown. Using the state assignment from this 
figure, the excitation table in Figure 9.74 is derived. 

The Karnaugh maps for the next-state functions are given in Figure 9.75. From these 
maps the following hazard-free expressions are obtained 


Y, = Ny, + Ny; + Dy, + yiy3 + y1y2 
Y2 = Ny, + Ny2 + yy, y3 + Dy2y3 + Dyrys 
Y3 = Dy, + y2y3 + Nyiy2 + Dy3N 


All product terms in these expressions are needed for a minimum-cost POS implementation 
except for y;y2, which is included to prevent hazards in the expression for Y;. The output 
expression is 


Z=Y1VoV3 
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Present Next state Output 
stat@é | DN=00 O1 10 11 z 
A (A) B Cc - 0 
B D (B) - - 0 
Cc f= («= 0 
D (D) BE F - 0 
E GH) = = 0 
F A - (® - 1 
G () HI - 0 
H A@ - - 1 
I A - () - 1 
J G) kK L - 0 
K A« - - 1 
L A - @ - 1 


Figure 9.68 _ Initial flow table for the vending-machine controller. 


Present Next state Output 
state = 00 10 11 Zz 
A Or: aC = 0 
B D C 7 0 
C G © - 0 
D (DP) B F - 0 
E G op : 0 
F A (F) - 1 
G (G) H F - 0 
H A) - - 1 


Figure 9.69 _ First step in state minimization. 
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A B C 
° ° 
H eD 
° 
G F E 


Figure 9.70 —=— Merger diagram for Figure 9.69. 


Present Next state Output 
state DN= 


A 


Qno.a Bw 
@)>Qavseys 
Qa 
Les 
| 


(a) Minimized flow table 


Present Next state Output 
state | pN=00 O1 10 11 z 


A @ 
5 
8 


Qn ovuUaQ wb 
w 
x 
| 


(b) Relabeled flow table 


Figure 9.71 —_ Reduced flow tables. 
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Figure 9.72 — State diagram for the vending-machine controller. 


B 5 D 110 4 Cc 
Ne |, ee" 
B 
8 
2 7 2 
100 G 
4 101 
6,7 
A 1 F 
e : e 000 001 
(a) Transition diagram (b) Embedded on the cube 


Figure 9.73 —_ Determination of the state assignment. 
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Present Next state 
state DN=00 Ol 10. 141 | Output 
3 Y2Y1 ¥3Yo¥1 z 


A 000 (000) 010 100 - 0 


B 010 O11 - - 0 
C 111 101 (a) (in) = 0 
D O11 111 O01 - 0 
F 001 000 - 1 


c| 101 (101) 001 O01 - 0 


100 - - 110. - 0 


110 - = lll - 0 


Figure 9.74 — Excitation table based on the state assignment in 
Figure 9.73b. 
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9.8 CONCLUDING REMARKS 


Asynchronous sequential circuits are more difficult to design than the synchronous sequen- 
tial circuits. The difficulties with race conditions present a problem that must be handled 
carefully. At the present time there is littke CAD support for designing asynchronous cir- 
cuits. For these reasons, most designers resort to synchronous sequential circuits in practical 
applications. 

An important advantage of asynchronous circuits is their speed of operation. Since 
there is no clock involved, the speed of operation depends only on the propagation delays 
in the circuit. In an asynchronous system that comprises several circuits, some circuits may 
operate faster than others, thus potentially improving the overall performance of the system. 
In contrast, in synchronous systems the clock period has to be long enough to accommodate 
the slowest circuit, and it has a large effect on the performance. 

Asynchronous circuit techniques are also useful in designing systems that consist of 
two or more synchronous circuits that operate under the control of different clocks. The 
signals exchanged between such circuits often appear to be asynchronous in nature. 

From the reader’s point of view, itis useful to view asynchronous circuits as an excellent 
vehicle for gaining a deeper understanding of the operation of digital circuits in general. 
These circuits illustrate the consequences of propagation delays and race conditions that 
may be inherent in the structure of a circuit. They also illustrate the concept of stability, 
demonstrated through the existence of stable and unstable states. For further discussion of 
asynchronous sequential circuits, the reader may consult references [1-6]. 
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01 11 10 


Y192 00 Ol 11 10 ¥192 00 Ol 11 10 


¥1¥2 00 O01 11 10 Y1¥2 


yz = 1 


(c) Map for Y3 


Figure 9.75 — Karnaugh maps for the functions in Figure 9.74. 
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9.9 EXAMPLES OF SOLVED PROBLEMS 


This section presents some typical problems that the reader may encounter, and shows how 
such problems can be solved. 


Problem: Derive a flow table that describes the behavior of the circuit in Figure 9.76. Example 9.19 
Solution: Modelling the propagation delay in the gates of the circuit as shown in Figure 9.8, 
the circuit in Figure 9.76 can be described by the following next-state and output expressions 
Y, = wiW2 + Woy) + Wiyiy2 
Y2=w2+wiyi + wiy2 
<= Y2 


These expressions lead to the excitation table in Figure 9.77a. Assuming the state assign- 
ment A = 00, B = 01, C = 10, and D = 11, yields the flow table in Figure 9.77b. 


Figure 9.76 — Circuit for Example 9.19. 


624 CHAPTER 9 © ASYNCHRONOUS SEQUENTIAL CIRCUITS 


Present Next state 
state w2w 1 = 00 01 10 11 Output 
y2Y1 YoY, mY BY we Zz 


00 ol 10 10 0 
01 ul 10-10 0 
10 00 61 1 
ll (1) @) @) 10 1 


(a) Excitation table 


Present Next state Output 
state | wow, = 00 O1 10 II z 
A (A) B c Cc 0 


B D(B) Cc cl o 
Cc A D(C©(C©|] 1 
D (D)®) ®) c 1 


(b) Flow table implemented by the circuit 


Present Next state 


state | wow,=00 O1 10 11 “ 


A (A) B c Cc] o 
B 
Cc ‘A (Oe), 4 
D 


(c) Final flow table 


Figure 9.77 Excitation and flow tables for the circuit in Figure 9.76. 


9.9 EXAMPLES OF SOLVED PROBLEMS 


Since in a given stable state, inputs to the circuit can only change one at a time, 
some entries in the flow table may be designated as unspecified. Such is the case when 
the circuit is stable in state B and input values are w2w; = 01. Now, both inputs cannot 
change simultaneously, which means that the corresponding entry in the flow table should 
be designated as unspecified. However, a different situation arises when the circuit is stable 
in state A and input values are w.w, = 00. In this case, we cannot indicate the transition 
in column ww; = 11 as unspecified. The reason is that if the circuit is in stable state B, 
it has to be able to change to state C when w2 changes from 0 to 1. States B and C are 
implemented as y2y; = 01 and y2y; = 10, respectively. Since both state variables must 
change their values, the route from 01 to 10 will take place either via 11 or 00, depending 
on the delays in different paths in the circuit. If yz changes first, the circuit will pass through 
unstable state D and then settle in the stable state C. But, if w; changes first, the circuit 
has to pass through unstable state A before reaching state C. Hence, the transition to state 
C in the first row must be specified. This is an example of a safe race, where the circuit 
reaches the correct destination state regardless of propagation delays in different paths of 
the circuit. The final flow table is presented in Figure 9.77c. 
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Problem: Are there any hazards in the circuit in Figure 9.76? 


Solution: Figure 9.78 gives Karnaugh maps for the next-state expressions derived in Ex- 
ample 9.19. As seen from the maps, all prime implicants are included in the expression for 
Y,. But, the expression for Y> includes only three of the four available prime implicants. 
There is a static hazard when w2y2y; = O11 and w; changes from 0 to | (or | to 0). This 
hazard can be removed by adding the fourth prime implicant, y; y2, to the expression for Yo. 


Figure 9.78 — Karnaugh maps for the circuit in Figure 9.76. 


Example 9.20 
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Example 9.21 


Problem: A circuit has an input w and an output z. A sequence of pulses is applied on input 
w. The output has to replicate every second pulse, as illustrated in Figure 9.79. Design a 
suitable circuit. 


Solution: Figure 9.80 shows a possible state diagram and the corresponding flow table. 
Compare this with the FSM defined in Example 9.4 in Figure 9.13, which specifies a serial 
parity generator. The only difference is in the output signal. In our case, z = 1 only in 
state B. Therefore, the next-state expressions are the same as in Example 9.4. The output 
expression is 


Z=yiy2 


Figure 9.79 ~— Waveforms for Example 9.21. 


(a) State diagram 


Present Next state Output 


Zz 


state =O 


A 0 


w=1 
(a) B 

B Cc 1 
C (C) D 
D A (D) 


(b) Flow table 


Figure 9.80 = State diagram and flow table for Example 9.21. 
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Present Next state Output 


state | wow,=00 O1 10 


A @) 


C 
E H 
- © 


oo So 


2'@=e7@' = 


zZ7Toa 7 mMUAQ SB 


Figure 9.81 Flow table for Example 9.22. 


627 


Problem: Consider the flow table in Figure 9.81. Reduce this flow table and find a state 


assignment that allows this FSM to be realized as simply as possible, preserving the Moore 
model. Derive an excitation table. 


Solution: Using the partioning procedure on the flow table in Figure 9.81 gives 
P, = (ACEFG)(BDH) 
Pz = (AG)(B)(C)(D)(E) (F) (A) 
P3 =P» 


Combining A and G produces the flow table in Figure 9.82. A merger diagram for this table 
is shown in Figure 9.83. Merging the states (A, E), (C, F), and (D, H) leads to the reduced 
flow table in Figure 9.84. To find a good state assignment, we relabel this flow table as 
indicated in Figure 9.85, and construct the transition diagram in Figure 9.86a. The only 
problem in this diagram is the transition from state D to state A, labeled as 1. Achange from 
D to A can be made via state C if we specify so in the flow table. Then, a direct transition 
from D to A is not needed, as depicted in Figure 9.86b. The resulting flow table and the 
corresponding excitation table are shown in Figure 9.87. 


Example 9.22 


Problem: Derive a hazard-free minimum-cost SOP implementation for the function 


F iy 2845) = by m(2, 3, 14, 17, 19, 25, 26, 30) + D(10, 23, 27, 31) 


Solution: The Karnaugh map for the function is given in Figure 9.88. From it, the required 
expression is derived as 


f = 1 X3X5 + XQX4X5 +X X2N3Xq + X2XZX4X5 


The first three product terms cover all 1s in the map. The fourth term is needed to avoid 
having a hazard when x2x3x4x5 = 0011 and x; changes from 0 to | (or 1 to 0). Thus, each 
pair of adjacent Is is covered by some prime implicant in the expression. 


Example 9.23 
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Present Next state Output 
state wow; =00 O1 10 11 . 
A (A) BE Cc - 0 
B - E H : 
Cc A - (c) F 0 
D A (D) - B 1 
E A (E) = 8B 0 
B - Dc @®] 0 
A - (@ B 1 


Figure 9.82 — Reduction after the partitioning procedure. 


E F H 


Figure 9.83 = Merger diagram for the flow table in Figure 9.82. 


Present Next state Output 


state | wow, = 00 O01 10 11 z 


A (A)(A) Cc BI O 
B - A D @)} 1 
a A D(@©@©| o 
D A () @) B 1 


Figure 9.84 — Reduced flow table for the FSM in Figure 9.82. 
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Present Next state Output 
state | wow, = 00 O01 10 11 z 
A QQ) 4 3 0 
B a 2 7 a 4 
C i: i. (4) Gy) ® 
D 1@@ 3 i 


Figure 9.85 


C=10 Lé D=ll 
1% 
A=00 2,3 B=01 


(a) Initial transition diagram 


C=10 iLé D=ll 
1,4 3,7 
A=00 2,3 B=01 


(b) Augmented transition diagram 


Figure 9.86 Transition diagrams for Figure 9.85. 


Relabeled flow table of Figure 9.84. 
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Present Next state Output 
state | wow,=00 Ol 10 II Zz 
A (A) (A) c B] o 
B - A D 1 
‘ rP©©) ° 
D C® (@) B| 1 
(a) Final flow table 
Present Next state 
state w2w, = 00 01 10 11 Output 
a7 YoY, YX, YoY, YoY z 


00 10 Ol 0 
01 - 0 1 
10 00 11 0 
ll 1 =8=l) Go 1 


(b) Excitation table 


Figure 9.87 — Excitation and flow tables for Example 9.22. 


Figure 9.88 — Karnaugh map for Example 9.23. 
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PROBLEMS 


Answers to problems marked by an asterisk are given at the back of the book. 


*9.1 Derive a flow table that describes the behavior of the circuit in Figure P9.1. Compare your 
solution with the tables in Figure 9.21. Is there any similarity? 


al 


cat 


2 


W, 


Figure P9.1 Circuit for Problem 9.1. 


9.2 Consider the circuit in Figure P9.2. Draw the waveforms for the signals C, z,, and Z2. 
Assume that C is a square-wave clock signal and that each gate has a propagation delay 
A. Express the behavior of the circuit in the form of a flow table that would produce the 
desired signals. (Hint: Use the Mealy model.) 


Cc 


Figure P9.2 — Circuit for Problem 9.2. 


9.3 Derive the minimal flow table that specifies the same functional behavior as the flow table 
in Figure P9.3. 


9.4 Derive the minimal Moore-type flow table that specifies the same functional behavior as 
the flow table in Figure P9.4. 
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9.5 


9.6 


9.7 


*9.8 
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Present Next state Output 
state | wow; = 00 O1 10 It z 
A G) Be @ 0 
B D Se 0 
c BS (@) <2 0 
D (D) BE F - 0 
E .¢) 0 
F M - (®) - 0 
G () H I - 0 
H 1@- - 0 
I AS TE 1 
J oe = 0 
K As = 1 
i A = CL) = 1 
M (M) NO - 0 
N A). = = 1 
O A - (0) - 1 
P Gy) es = 0 
R PRS: = 0 
S A =. TS) S 1 
a @) uv - 0 
U A@ - - 1 
Vv A= 1 


Figure P9.3. Flow table for Problem 9.3. 


Find a suitable state assignment using as few states as possible and derive the next-state 
and output expressions for the flow table in Figure 9.42. 


Find a suitable state assignment for the flow table in Figure 9.42, using pairs of equivalent 
states, as explained in Example 9.15. Derive the next-state and output expressions. 


Find a state assignment for the flow table in Figure 9.42, using one-hot encoding. Derive 
the next-state and output expressions. 


Implement the FSM specified in Figure 9.39, using the merger diagram in Figure 9.40a. 
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Present Next state Output 
state | wow;=00 O01 10 11 ‘ 
A (A) B Cc - 0 
B K (B) - 4H 0 
fe f= (C) Mi © 
D (DP) E J - 1 
E A () - M| o 
F @ Lui - 0 
G DG) - H| 0a 
H =. € f @) 4 
J F - G) H| o 
K ‘ob Cs 1 
L AQ) - 4H 0 
M - G c @™| 1 


Figure P9.4 Flow table for Problem 9.4. 


9.9 Find a suitable state assignment for the FSM defined by the flow table in Figure P9.5. 
Derive the next-state and output expressions for the FSM using this state assignment. 


Present Next state Output 
state | wow, = 00 O1 10 II z 
A (A) B Cc - 


= © 


E 

A) - G| 0 
E 0 
B 


© 
| 
Qa a 
ooo 


Cc «= 
c @). 4 


Figure P9.5 —_ Flow table for Problem 9.9. 


Q Ht OO am oa 
()nm ov 


*9.10 ‘Find a hazard-free minimum-cost implementation of the function 


ft, ---.44) = )>m(0, 4, 11, 13, 15) + DQ, 3, 5, 10) 
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9.11 


*9.12 


9.13 


*9.14 


9.15 
9.16 


*9.17 


9.18 


9.19 
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Repeat Problem 9.10 for the function 


fis. 5X5) = > mo, 4,5, 24, 25, 29) + D(8, 13, 16, 21) 


Find a hazard-free minimum-cost POS implementation of the function 
f(a, ...,x4) = 1M (0, 2, 3,7, 10) + D(S, 13, 15) 
Repeat Problem 9.12 for the function 


f(x, ...,x5) = TIM (2, 6, 7, 25, 28, 29) + D(O, 8, 9, 10, 11, 21, 24, 26, 27, 30) 
Consider the circuit in Figure P9.6. Does this circuit exhibit any hazards? 


A 
B 


E 


Figure P9.6 — Circuit for Problem 9.14. 


Design an original circuit that exhibits a dynamic hazard. 


A control mechanism for a vending machine accepts nickels and dimes. It dispenses mer- 
chandise when 20 cents is deposited; it does not give change if 25 cents is deposited. 
Design the FSM that implements the required control, using as few states as possible. Find 
a suitable state assignment and derive the next-state and output expressions. 


Design an asynchronous circuit that meets the following specifications. The circuit has two 
inputs: a clock input c and acontrol input w. The output, z, replicates the clock pulses when 
w = |; otherwise, z = 0. The pulses appearing on z must be full pulses. Consequently, if 
c = | when w changes from 0 to 1, then the circuit will not produce a partial pulse on z, but 
will wait until the next clock pulse to generate z = 1. If c = 1 when w changes from | to 0, 
then a full pulse must be generated; that is, z = 1 as long as c = 1. Figure P9.7 illustrates 
the desired operation. 


Repeat Problem 9.17 but with the following change in the specification. While w = 1, the 
output z should have only one pulse; if several pulses occur on c, only the first one should 
be reproduced on z. 


Example 9.6 describes a simple arbiter for two devices contending for a shared resource. 
Design a similar arbiter for three devices that use a shared resource. In case of simultaneous 
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Figure P9.7 = Waveforms for Problem 9.17. 


requests, namely, if one device has been granted access to the shared resource and before it 
releases its request the other two devices make requests of their own, let the priority of the 
devices be Device | > Device 2 > Device 3. 


9.20 In the discussion of Example 9.6, we mentioned a possible use of the mutual exclusion 
element (ME) to prevent both request inputs to the FSM being equal to | at the same time. 
Design an arbiter circuit for this case. 


9.21 In Example 9.21 we designed a circuit that replicates every second pulse on input w as a 
pulse on output z. Design a similar circuit that replicates every third pulse. 


9.22 In Example 9.22 we merged states D and H to implement the FSM in Figure 9.82. An 
alternative was to merge states B and H, according to the merger diagram in Figure 9.83. 
Derive an implementation using this choice. Derive the resulting excitation table. 
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COMPUTER AIDED DESIGN TOOLS 


CHAPTER OBJECTIVES 


In this chapter you will learn how CAD tools can be used to design 
and implement a logic circuit. The discussion deals with the synthesis 
and physical design stages in a typical CAD system, including 


Netlist extraction 
Technology mapping 
Placement 

Routing 


Static timing analysis 
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We introduced CAD tools in Section 2.9, and have discussed them briefly in other chapters. The word tool 
in this context means a software program that allows a user to perform a particular task. In this chapter we 
describe some of the tools in a typical CAD system in more detail, by showing how a small design example 
is processed and optimized as it passes through different stages in the CAD flow. 


| 10.1 SYNTHESIS 


Figure 10.1, which is reproduced from Figure 2.35, gives an overview of a CAD system. A 
description of the desired circuit is prepared, usually in the form of a hardware description 
language like Verilog HDL. The Verilog code is then processed by the synthesis stage 
of the CAD system. Synthesis is the process of generating a logic circuit from the user’s 
specification. Figure 10.2 shows three typical phases that are found in the synthesis process. 


10.1.1 NETLIST GENERATION 


The netlist generation phase checks the syntax of the code, and reports any errors such as 
undefined signals, missing parentheses, and wrong keywords. Once all errors are fixed a 
circuit netlist is generated as determined by the semantics of the Verilog code. The netlist 
uses logic expressions to describe the circuit, and includes components such as adders, 
flip-flops, and finite state machines. 


10.1.2 GATE OPTIMIZATION 


The next phase is gate optimization, which performs the kinds of logic optimizations de- 
scribed in the preceding chapters. These optimizations manipulate the netlist to obtain an 
equivalent, but better circuit according to the optimization goals. As we said previously, 
the measurement of what makes one circuit better than another may be based on the cost 
of the circuit, its speed of operation, or a combination of both. 

As an example of results produced by the synthesis phases discussed so far, consider 
the Verilog code for the addersubtractor module in Figure 10.3, which specifies a circuit 
that can add or subtract n-bit numbers and accumulate the result in a register. From this 
code the synthesis tool produces a netlist that corresponds to the circuit in Figure 10.4. The 
input numbers, A = ay_-1,..., a9 and B = b,_\,..., bo, are placed into registers Areg and 
Breg prior to being used in addition or subtraction operations. These registers synchronize 
the operation of the circuit if A and B are externally provided asynchronous inputs. The 
control input Se/ determines the mode of operation. If Se/ = 0, then A is selected as an input 
to the adder; if Sel = 1, then the result register Zreg is selected. The control input AddSub 
determines whether the operation is addition or subtraction. The flip-flops in Figure 10.4 for 
registers A, B, Sel, AddSub, and Overflow are inferred from the code at the bottom of Figure 
10.3a. Multiplexers are produced from the mux2tol module, and an adder is generated 
from the adderk module in Figure 10.3b. The exclusive-OR gates connected to register B, 
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Synthesis 


Netlist Generation 


Technology Mapping 


Figure 10.2 The stages included in a synthesis tool. 


and the XOR function for the Overflow output are generated from the code near the middle 
of the addersubtractor module. 


10.1.3 TECHNOLOGY MAPPING 


The final phase of synthesis is technology mapping. This phase determines how each 
component in the netlist can be realized in the resources available in the target chip, such 
as an FPGA. A logic block for a typical FPGA is depicted in Figure 10.5a. It contains a 
four-input LUT and a flip-flop, and has two outputs. A multiplexer is provided to allow 
loading of the flip-flop from the LUT or directly from input /n3. Another multiplexer allows 
the stored value in the flip-flop to be fed back to one input of the LUT. There are a number of 
different ways, or modes, in which this logic block can be used. The most straightforward 
choice is to implement a function of up to four inputs in the LUT, and store this function’s 
value in the flip-flop; both the LUT and flip-flop can provide outputs from the logic block. 
Parts b to e of the figure illustrate four other modes of using the block. In parts b and c only 
the LUT or the flip-flop is used, but not both. In part d only the LUT provides an output of 
the logic block, and one of the LUT’s inputs is connected to the flip-flop. 

FPGAs often contain dedicated circuitry for implementation of fast adder circuits. 
Figure 10.5e shows one way in which such circuitry can be realized. The LUT is used in 
two halves, where one half produces the sum function of three LUT inputs and the other half 
produces the carry function of these inputs. The sum function can provide an output of the 
block or be stored in the flip-flop, and the carry function provides a special output signal. 
This carry output connects directly to a neighboring logic block that uses it as a carry input. 


10.1. SYNTHESIS 


// Top-level module 

module addersubtractor (A, B, Clock, Reset, Sel, AddSub, Z, Overflow); 
parameter n = 16; 
input [n—1:0] A, B; 
input Clock, Reset, Sel, AddSub; 
output [n—1:0] Z; 
output Overflow; 
reg SelR, AddSubR, Overflow; 
reg [n—1:0] Areg, Breg, Zreg; 
wire [n—1:0] G H, M, Z; 
wire carryout, over_flow; 


// Define combinational logic circuit 

assign H = Breg “ {n{ AddSubR} }; 

mux2tol multiplexer (Areg, Z, SelR, G); 
defparam multiplexer.k = n; 

adderk nbit_adder (AddSubR, G, H, M, carryout); 
defparam nbit_adder.k = n; 

assign over_flow = carryout “ G[n—1] “ H[n—1] * M[n-1]; 

assign Z = Zreg; 


// Define flip-flops and registers 
always @(posedge Reset or posedge Clock) 
if (Reset == 1) 
begin 
Areg <= 0; Breg <= 0; Zreg <= 0; 
SelR <= 0; AddSubR <= 0; Overflow <= 0; 
end 
else 
begin 
Areg <= A; Breg <= B; Zreg <= M; 
SelR <= Sel; AddSubR <= AddSub; Overflow <= over_flow; 
end 
endmodule 
... continued in Part b 


Figure 10.3 Verilog code for an accumulator circuit (Part a). 


This block in turn generates the next stage of carry output, and so on. In this way, direct 
connections between neighboring logic blocks are used to form fast carry chains. 

Figure 10.6 shows a part of the results of technology mapping for the netlist generated 
for Figure 10.4. Each logic block is highlighted with a blue square, and has a label on the 
lower left corner that indicates which mode in Figure 10.5 is being used. The figure shows 
bit io from Figure 10.4, which is produced by a logic block in mode d. This block uses 
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// k-bit 2-to-1 multiplexer 
module mux2tol (V, W, Selm, F); 
parameter k = 8; 
input [k—1:0] V, W; 
input Selm; 
output [k—1:0] F; 
reg [k—1:0] F; 


always @(V or W or Selm) 
if (Selm == 0) F=V; 
else F=W; 
endmodule 


// k-bit adder 
module adderk (carryin, X, Y, S, carryout); 
parameter k = 8; 
input [k—1:0] X, Y; 
input carryin; 
output [k—1:0] S; 
output carryout; 
reg [k—1:0] S; 
reg carryout; 


always @(X or Y or carryin) 
{carryout, S} = X + Y + carryin; 
endmodule 


Figure 10.3 Verilog code for an accumulator circuit (Part b). 


a flip-flop to store the value of primary input bo, and implements an XOR function in its 
LUT, which is needed in subtraction operations to complement the number B. One input of 
the XOR is provided by the logic block in mode c that stores in a flip-flop the value of the 
AddSub input. This flip-flop also drives 15 other logic blocks that implement fy, ..., Ay45, 
but these blocks are not shown in the figure. 

The AddSub flip-flop is connected to the carry-in of the first logic block in the adder. 
This block uses mode e to produce sum and carry outputs. The sum is stored in a flip-flop 
that produces Zo, and the carry feeds the next stage of the adder. The figure shows the carry 
function in the form 


C1 = (co B ho) - ho + (co @ ho) - go 


This expression is functionally equivalent to the one used in Chapter 3, which has the form 
C1 = coho + cogo + hogo, but it represents more closely how the carry chain is built in an 
FPGA. The last logic block of the adder in Figure 10.6 does not use its flip-flop, because 
the sum output has to be connected directly to the logic block that implements the Overflow 
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do Sel B= n-1 bo AddSub 


b 
| alti | | | oie | | 


breg, 1 


AddSubR 


carryout n-bit adder carryin 


Overflow Z=Z 


n—-1 20 


Figure 10.4 — Circuit specified by the code in Figure 10.3. 


signal. The sum output cannot be provided from both the combinational and registered 
outputs concurrently, so a separate logic block in mode c is needed for the z;5 signal. 

Figure 10.6 shows only a few of the logic blocks that a technology mapping tool would 
create for implementing our circuit. In general, there are many different ways in which 
technology mapping can be done, and each method will lead to equivalent, but different 
circuits. The reader can consult references [1—3] for a detailed discussion of technology 
mapping approaches. 
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(a) An FPGA logic element. 


oat elAy 


(b) Combinational mode (c) Synchronous mode 
(d) Synchronous feedback mode (e) Arithmetic (synchronous) mode 


Figure 10.5 _ Different modes of an FPGA logic block. 


| 10.2 PHYSICAL DESIGN 


The next stages following synthesis in Figure 10.1 are functional simulation and physical 
design. As we said in Section 2.9, functional simulation involves applying test patterns to 
the synthesized netlist and checking to see if it produces the correct outputs. The simulation 
assumes that there are no propagation delays in the circuit, because the intent is to evaluate 
the basic functionality rather than timing. The netlist used by a functional simulator could 
be either the version before technology mapping, or after. 

Once the netlist produced by synthesis is functionally correct, the physical design 
stage can be performed. This stage determines exactly how the synthesized netlist will be 
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Physical Design 


Static Timing Analysis 


Figure 10.7 Phases in physical design. 


implemented in the target chip. As illustrated in Figure 10.7, three phases are involved: 
placement, routing, and static timing analysis. 


10.2.1 PLACEMENT 


The placement phase chooses a location on the target device for each logic block in the 
technology-mapped netlist. An example of a placement result is given in Figure 10.8. It 
shows an array of logic blocks in a small portion of an FPGA chip. The white squares 
represent unoccupied blocks and the grey squares show the placement of blocks that imple- 
ment the circuit of Figure 10.4. There is a total of 53 logic blocks in this circuit, including 
the ones shown in Figure 10.6. Also shown in Figure 10.8 is the placement of some of the 
primary inputs to the circuit, which are assigned to pins around the chip periphery. 

To find a good placement solution a number of different locations have to be considered 
for each logic block. For a large circuit, which may contain tens of thousands of blocks, 
this is a hard problem to solve. To appreciate the complexity involved, consider how many 
different placement solutions are possible for a given circuit. Assume that the circuit has 
N logic blocks, and it is to be placed in an FPGA that also contains exactly N blocks. A 
placement tool has N choices for the location of the first block that it selects. There remain 
N — 1 choices for the second block, N — 2 choices for the third, and so on. Multiplying 
these choices gives a total of (V)(NV — 1)--- (1) =M! possible placement solutions. For 
even moderate values, NV! is a huge number, which means that heuristic techniques must be 
used to find a good solution while considering only a small fraction of the total number of 
choices. A typical commercial placement tool operates by constructing an initial placement 
configuration and then moving logic blocks to different locations in an iterative manner. 
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Figure 10.8 — Placement of the circuit in Figure 10.6. 


For each iteration the quality of the solution is assessed by using metrics that estimate the 
speed of operation of the implemented circuit, or its cost. The placement problem has been 
studied extensively and is described in detail in references [4-7]. 


10.2.2 ROUTING 


Once a location in the chip is chosen for each logic block in a circuit, the routing phase 
connects the blocks together by using the wires that exist in the chip. An example of a 
routing solution for the placement in Figure 10.8 is given in Figure 10.9. In addition to 
showing the logic blocks, this figure also displays some of the wires in the chip. Wires 
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Figure 10.9 Routing for the placement in Figure 10.8. 


that are being used by the implemented circuit are shaded in grey. The figure depicts both 
individual wires, which may be of various lengths, and bundles of wires, which are shown as 
grey rectangles. The routing CAD tool tries to make the best use of various kinds of wires, 
such as efficient connections for carry chains. Figure 10.9 shows an example of the carry 
chain path from Figure 10.6. Black lines highlight the carry chain wires, which connect 
through the stages of the adder, ending at the Overflow register. A detailed discussion of 
routing tools can be found in references [3], [5—6], and [8]. 


10.2.3 Static TIMING ANALYSIS 


After routing is complete the timing delays for the implemented circuit are known, because 
the CAD system computes the timing delays of all blocks and wires in the chip. A static 
timing analysis tool examines this delay information and produces a set of tables that 
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quantify the circuit’s performance. An example of a timing analysis result is given in 
Table 10.1, which lists four parameters: Fina, tsu, teo, and ty. The Fina, Value specifies the 
maximum operating frequency of the circuit’s clock. This value is determined by the path 
with the longest propagation delay, often called the critical path, between any two flip-flops 
in the circuit. As shown in Sections 5.15 and 7.8, the path delay must account for the delays 
through logic blocks and wires, as well as the flip-flop clock-to-Q delay (¢,q) and setup (f5,,) 
parameters. In our example the critical path delay is 1/261.1 x 10° = 3.83 ns. The last 
two columns in the Fj, row show that the path starts at the AddSub flip-flop and ends at 
the Overflow flip-flop shown in Figure 10.6. 

Most CAD systems allow users to specify the timing constraints for their circuit. In 
Table 10.1 we assume the user has specified that the circuit clock has to operate correctly 
up to a frequency of 200 MHz. The difference between this requirement and the result that 
is obtained by the CAD tools is referred to as slack. In the table, the requirement is that 
the propagation delays must not exceed 1/200 x 10° = 5 ns; the result is 3.83 ns, which 
gives a slack value of 1.17 ns. This positive slack means that the constraints have been 
met with some room to spare. If the obtained result had a negative slack, then the user’s 
requirements would not have been met, and it would be necessary to modify the Verilog 
code or settings used in the CAD tool to try to meet the constraints. 

The other rows in Table 10.1 show the timing results for the design’s primary inputs 
and outputs. The ¢,, result indicates the worst-case setup requirement is 2.356 ns, from 
pin bo to flip-flop brego. This parameter means that the bp signal must have a stable value 
at least 2.356 ns before each active edge of the clock signal at its assigned pin. Since the 
designer specified a worst-case setup constraint of 10 ns, the obtained result means that the 
implemented circuit exceeds (meets) the requirement by a slack value of 7.644 ns. The 
worst-case clock-to-output delay for our circuit is 6.772 ns, from flip-flop zrego to pin Zo. 
This means that the propagation delay from an active edge of the clock signal at its pin to a 
corresponding change in the zo signal at its pin is 6.772 ns. Since the designer’s constraint 
specifies that a 10 ns f,, is allowed, the available slack is 3.228 ns. 

The last row in Table 10.1 gives a maximum hold time of 0.24 ns, for the path from 
pin b; to flip-flop breg;. Hence, the signal at pin b; must maintain a stable value for at least 
0.24 ns after each active edge at the clock pin. We assume that no constraint was set for 
this parameter, thus no slack value is shown. 


Table 10.1. =A summary of static timing analysis results. 


Parameter Actual Required Slack From To 

|e 261.1 MHz 200 MHz 1.17 ns AddSub Overflow 
toy 2.356 ns 10.0 ns 7.644 ns bo brego 

ite 6.772 ns 10.0 ns 3.228 ns Zrego Z0 

th 0.240 ns N/A N/A by breg 
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Table 10.1 lists only the worst-case paths for Finax, tsu, teo, and t,. The implemented 
circuit will have a number of other paths that have smaller delays and greater slack values. 
A static timing analysis tool typically provides additional tables for each parameter, which 
list more paths. 

The final stage of the CAD flow in Figure 10.1 is timing simulation. In this stage both 
the circuit’s functional and timing behavior can be simulated. When performing timing 
simulation the CAD tools include estimates of timing delays for all logic elements, routing 
wires, and other resources that exist in the target device. 


10.3 CONCLUDING REMARKS 


In this chapter we explained briefly a typical design flow made possible by the existence of 
powerful CAD tools. We considered only the most important subset of the tools available 
in commercial CAD systems. To learn more the reader can consult references [1-8], or 
visit the web sites of CAD tool suppliers. Table 10.2 lists some of the major vendors of 
CAD tools, and shows their web addresses and the names of some popular products. 


Table 10.2 = Major CAD tool products. 


Vendor Name WWW Locator Product Names 
Altera altera.com Quartus IT 

Mentor Graphics mentorgraphics.com ModelSim, Precision 
Synopsys synopsys.com Design Compiler, VCS 
Xilinx xilinx.com ISE, Vivado 
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TESTING OF LOGIC CIRCUITS 


CHAPTER OBJECTIVES 


In this chapter you will be introduced to: 

e Various techniques for testing of digital circuits 

e Representation of typical faults in a circuit 

e Derivation of tests needed to test the behavior of a circuit 
e Design of circuits for easy testability 
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In the previous chapters we have discussed the design of logic circuits. Following a sound design procedure, 
we expect that the designed circuit will perform as required. But how does one verify that the final circuit 
indeed achieves the design objectives? It is essential to ascertain that the circuit exhibits the required functional 
behavior and that it meets any timing constraints that are imposed on the design. We have discussed the timing 
issues in several places in the book. In this chapter we will discuss some testing techniques that can be used 
to verify the functionality of a given circuit. 

There are several reasons for testing a logic circuit. When the circuit is first developed, it is necessary 
to verify that the designed circuit meets the required functional and timing specifications. When multiple 
copies of a correctly designed circuit are being manufactured, it is essential to test each copy to ensure that 
the manufacturing process has not introduced any flaws. It is also necessary to test circuits used in equipment 
that is installed in the field when it is suspected that there may be something wrong. 

The basis of all testing techniques is to apply predefined sets of inputs, called fests, to a circuit and 
compare the outputs observed with the patterns that a correctly functioning circuit is supposed to produce. 
The challenge is to derive a relatively small number of tests that provide an adequate indication that the circuit 
is correct. The exhaustive approach of applying all possible tests is impractical for large circuits because 
there are too many possible tests. 


| T1.1 FAULT MODEL 


A circuit functions incorrectly when there is something wrong with it, such as a transistor 
fault or an interconnection wiring fault. Many things can go wrong, leading to a variety of 
faults. A transistor switch can break so that it is permanently either closed or open. A wire 
in the circuit can be shorted to Vpp or to ground, or it can be simply broken. There can be an 
unwanted connection between two wires. A logic gate may generate a wrong output signal 
because of a fault in the circuitry that implements the gate. Dealing with many different 
types of faults is cumbersome. Fortunately, it is possible to restrict the testing process to 
some simple faults, and obtain generally satisfactory results. 


11.1.1 Struck-aT MoDEL 


Most circuits discussed in this text use logic gates as the basic building blocks. A good 
model for representing faults in such circuits is to assume that all faults manifest themselves 
as some wires (inputs or outputs of gates) being permanently stuck at logic value 0 or 1. 
We indicate that a wire, w, has an undesirable signal that always corresponds to the logic 
value O by saying that w is stuck-at-O0, which is denoted as w/0. If w has an undesirable 
signal that is always equal to logic 1, then w is stuck-at-1, which is denoted as w/1. 

An obvious example of a stuck-at fault is when an input to a gate is incorrectly connected 
to a power supply, either Vpp or ground. But the stuck-at model is also useful for dealing 
with faults of other types, which often cause the same problems as if a wire were stuck at 
a particular logic value. The exact impact of a fault in the circuitry that implements a logic 
gate depends on the particular technology used. We will restrict our attention to the stuck-at 
faults and will examine the testing process assuming that these are the only faults that can 
occur. 


11.2 COMPLEXITY OF A TEST SET 


11.1.2 SINGLE AND MULTIPLE FAULTS 


A circuit can be faulty because it has either a single fault or possibly many faults. Dealing 
with multiple faults is difficult because each fault can occur in many different ways. A 
pragmatic approach is to consider single faults only. Practice has shown that a set of tests 
that can detect all single faults in a given circuit can also detect the vast majority of multiple 
faults. 

A fault is detected if the output value produced by the faulty circuit is different from 
the value produced by the good circuit when an appropriate test is applied as input. Each 
test is supposed to be able to detect the occurrence of one or more faults. A complete set of 
tests used for a given circuit is referred to as the fest set. 


11.1.3 CMOS Circults 


CMOS logic circuits present some special problems in terms of faulty behavior. The 
transistors may fail in permanently open or shorted (closed) state. Many such failures 
manifest themselves as stuck-at faults. But some produce entirely different behavior. For 
example, transistors that fail in the shorted state may cause a continuous flow of current from 
Vpp to ground, which can create an intermediate output voltage that may not be determined 
as either logic 0 or 1. Transistors failing in the open state may lead to conditions where the 
output capacitor retains its charge level because the switch that is supposed to discharge it 
is broken. The effect is that a combinational CMOS circuit starts behaving as a sequential 
circuit. 

Specific techniques for testing of CMOS circuits are beyond the scope of this book. An 
introductory discussion of this topic can be found in references [1-3]. Testing of CMOS 
circuits has been the subject of considerable research [4-6]. We will assume that a test 
set developed using the stuck-at model will provide an adequate coverage of faults in all 
circuits. 
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11.2 COMPLEXITY OF A TEST SET 


There is large difference in testing combinational and sequential circuits. Combinational 
circuits can be tested adequately regardless of their design. Sequential circuits present a 
much greater challenge because the behavior of a circuit under test is influenced not only 
by the tests that are applied to the external inputs but also by the states that the circuit is 
in when the tests are applied. It is very difficult to test a sequential circuit designed by a 
designer who does not take its testability into account. However, it is possible to design 
such circuits to make them more easily testable, as we will discuss in Section 11.6. We will 
start by considering the testing of combinational circuits. 

An obvious way to test a combinational circuit is to apply a test set that comprises all 
possible input valuations. Then it is only necessary to check if the output values produced 
by the circuit are the same as specified in a truth table that defines the circuit. This approach 
works well for small circuits, where the test set is not large, but it becomes totally impractical 
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for large circuits with many input variables. Fortunately, it is not necessary to apply all 2” 
valuations as tests for an n-input circuit. A complete test set, capable of detecting all single 
faults, usually comprises a much smaller number of tests. 

Figure 11.1la gives a simple three-input circuit for which we want to determine the 
smallest test set. An exhaustive test set would include all eight input valuations. This 
circuit involves five wires, labeled a, b, c, d, and f in the figure. Using our fault model, 
each wire can be stuck either at 0 or 1. 

Figure 11.1b enumerates the utility of the eight input valuations as possible tests for 
the circuit. The valuation w;w2w3 = 000 can detect the occurrence of a stuck-at-1 fault on 
wires a, d, and f. In a good circuit this test results in the output f = 0. However, if any 
of the faults a/1, d/1, or f/1 occurs, then the circuit will produce f = | when the input 
valuation 000 is applied. The test 001 causes f = 0 in the good circuit, and it results in 


(a) Circuit 


Fault detected 


J 

v J 

v J 

J J Vv 

J 
if 
JV 
(b) 


Faults detected by the various input valuations 


Figure 11.1 Fault detection in a simple circuit. 


11.3. PATH SENSITIZING 


f = lif any of the faults a/1, b/1, d/1, orf /1 occurs. This test can detect the occurrence 
of four different faults. We say that it covers these faults. The last test, 111, can detect only 
one fault, f /0. 

A minimal test set that covers all faults in the circuit can be derived from the table by 
inspection. Some faults are covered by only one test, which means that these tests must be 
included in the test set. The fault b/1 is covered only by 001. The fault c/1 is covered only 
by 010. The faults b/0, c/0, and d/0 are covered only by 011. Therefore, these three tests 
are essential. For the remaining faults there is a choice of tests that can be used. Selecting 
the tests 001, 010, and 011 covers all faults except a/0. This fault can be covered by three 
different tests. Choosing 100 arbitrarily, a complete test set for the circuit is 


Test set = {001, 010, 011, 100} 


The conclusion is that all possible stuck-at faults in this circuit can be detected using four 
tests, rather than the eight tests that would be used if we simply tried to test the circuit using 
its complete truth table. 

The size of the complete test set for a given n-input circuit is generally much smaller 
than 2”. But this size may still be unacceptably large in practical terms. Moreover, deriving 
the minimal test set is likely to be a daunting task for even moderately sized circuits. 
Certainly, the simple approach of Figure 11.1 is not practical. In the next section we will 
explore a more interesting approach. 
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11.3. PATH SENSITIZING 


Deriving a test set by considering the individual faults on all wires in a circuit, as done in 
Section 11.2, is not attractive from the practical point of view. There are too many wires 
and too many faults to consider. A better alternative is to deal with several wires that form 
a path as an entity that can be tested for several faults using a single test. It is possible to 
activate a path so that the changes in the signal that propagates along the path have a direct 
impact on the output signal. 

Figure 11.2 illustrates a path from input w; to output f, through three gates, which 
consists of wires a, b, c, and f. The path is activated by ensuring that other paths in the 
circuit do not determine the value of the output f. Thus the input w. must be set to | so 
that the signal at b depends only on the value at a. The input w3 must be 0 so that it does 


Figure 11.2 —_A sensitized path. 
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not affect the NOR gate, and w4 must be | to not affect the AND gate. Then if w; = 0 the 
output will be f = 1, whereas w; = 1 will cause f = 0. Instead of saying that the path from 
w tof is activated, a more specific term is used in technical literature, which says that the 
path is sensitized. 

To sensitize a path through an input of an AND or NAND gate, all other inputs must 
be set to 1. To sensitize a path through an input of an OR or NOR gate, all other inputs 
must be 0. 

Consider now the effect of faults along a sensitized path. The fault a/0 in Figure 
11.2 will cause f = | even if w; = 1. The same effect occurs if the faults b/O or c/1 are 
present. Thus the test w;w2w3w4 = 1101 detects the occurrence of faults a/0, b/0, and c/1. 
Similarly, if w; = 0, the output should be f = 1. But if any of the faults a/1, b/1, or c/0 
is present, the output will be f = 0. Hence these three faults are detectable using the test 
0101. The presence of any stuck-at fault along the sensitized path is detectable by applying 
only two tests. 

The number of paths in a given circuit is likely to be much smaller than the number 
of individual wires. This suggests that it may be attractive to derive a test set based on the 
sensitized paths. This possibility is illustrated in the next example. 


Example 11.1 


PATH-SENSITIZED TESTS Consider the circuit in Figure 11.3. This circuit has five paths. 
The path w; — c —f is sensitized by setting w2 = | and w4 = 0. It doesn’t matter whether 
w3 is 0 or 1, because wz = | causes the signal on wire b to be equal to 0, which forces 
d = O regardless of the value of w3. Thus the path is sensitized by setting w2w3w4 = 1x0, 
where the symbol x means that the value of w3 does not matter. Now the tests w;w2w3w4 = 
01x0 and 11x0 detect all faults along this path. The second path, w2 — c — f, is tested using 
1000 and 1100. The path w. — b — d — ff is tested using 0010 and 0110. The tests for the 
path w3 — d —f are x000 and x010. The fifth path, w4 — f, is tested with 0x00 and 0x01. 
Instead of using all ten of these tests, we can observe that the test 0110 serves also as the 
test 01x0, the test 1100 serves also as 11x0, the test 1000 serves also as x000, and the test 
0010 serves also as x010. Therefore, the complete test set is 


Test set = {0110, 1100, 1000, 0010, 0x00, 0x01} 


Figure 11.3 = Circuit for Example 11.1. 


11.3. PATH SENSITIZING 


While this approach is simpler, it is still impractical for large circuits. But the concept of 
path sensitizing is very useful, as we will see in the discussion that follows. 
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11.3.1 DETECTION OF A SPECIFIC FAULT 


Suppose that we suspect that the circuit in Figure 11.3 has a fault where the wire b is stuck- 
at-1. A test that determines the presence of this fault can be obtained by sensitizing a path 
that propagates the effect of the fault to the output, f, where it can be observed. The path 
goes from b tod tof. It is necessary to set w3 = 1, w4 = 0, and c = 0. The latter can be 
accomplished by setting w; = 0. If b is stuck-at-1, then it is necessary to apply an input 
that would normally produce the value of 0 on the wire b, so that the output values in good 
and faulty circuits would be different. Hence wz must be set to 1. Therefore, the test that 
detects the b/1 fault is wiw2w3w4 = 0110. 

In general, the fault on a given wire can be detected by propagating the effect of the 
fault to the output, sensitizing an appropriate path. This involves assigning values to other 
inputs of the gates along the path. These values must be obtainable by assigning specific 
values to the primary inputs, which may not always be possible. Example 11.2 illustrates 
the process. 


FAULT PROPAGATION _ As the effect of a fault propagates through the gates along a 
sensitized path, the polarity of the signal will change when passing through an inverting 
gate. Let the symbol D denote a stuck-at-0 fault in general. The effect of the stuck-at-0 
fault will be unaltered when passed through an AND or OR gate. If D is on an input of an 
AND (OR) gate and the other inputs are set to 1 (0), then the output of the gate will behave 
as having D on it. But if D is on an input of a NOT, NAND, or NOR gate, then the output 
will appear to be stuck-at-1, which is denoted as D. 

Figure 11.4 shows how the effect of a fault can be propagated using the D and D 
symbols. Suppose first that there is a stuck-at-0 fault on wire b; that is, b/O. We want to 
propagate the effect of this fault along the path b — h —f. This can be done as indicated 
in Figure 11.4b. Setting g = | propagates the fault to the wire h. Then h appears to be 
stuck-at-1, denoted by D. Next the effect is propagated to f by setting k = 1. Since the last 
NAND also inverts the signal, the output becomes equal to D, which is equivalent to f/0. 
Thus in a good circuit the output should be 1, but in a faulty circuit it will be 0. Next we must 
ascertain that it is possible to have g = | and k = 1 by assigning the appropriate values to 
the primary input variables. This is called the consistency check. By setting c = 0, both g 
and k will be forced to 1, which can be achieved with w3 = wy = 1. Finally, to cause the 
propagation of the fault D on wire b, it is necessary to apply a signal that causes b to have 
the value 1, which means that either w; or w2 has to be 0. Then the test w;yw2w3w4 = 0011 
detects the fault b/0. 

Suppose next that the wire g is stuck-at-1, denoted by D. We can try to propagate the 
effect of this fault through the path g — h —f by setting b = 1 andk = 1. Tomake b= 1, 


Example 11.2 
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(a) Circuit 
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(b) Detection of b/0 fault 
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(c) Detection of g/1 fault 


Figure 11.4 _ Detection of faults. 


we set w] = W2 = 0. To make k = 1, we have to make c = 0. But it is also necessary to 
cause the propagation of the D fault on g by means of a signal that makes g = 0 in the good 
circuit. This can be done only if b = c = 1. The problem is that at the same time we need 
c = 0, to make k = 1. Therefore, the consistency check fails, and the fault g/1 cannot be 
propagated in this way. 

Another possibility is to propagate the effect of the fault along two paths simultaneously, 
as shown in Figure 11.4c. In this case the fault is propagated along the paths g — h — f and 


11.4 CIRCUITS WITH TREE STRUCTURE 


g —k —f. This requires setting b = | and c = 1, which also happens to be the condition 
needed to cause the propagation as explained above. The test 0000 achieves the desired 
objective of detecting g/1. Observe that if D (or D) appears on both inputs of a NAND 
gate, the output value will be D (or D). 

The idea of propagating the effect of faults using path sensitizing has been exploited in 
a number of methods for derivation of test sets for fault detection. The scheme illustrated 
in Figure 11.4 indicates the essence of the D-algorithm, which was one of the first practical 
schemes developed for fault detection purposes [7]. Other techniques have grown from this 
basic approach [8]. 
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11.4 CIRCUITS WITH TREE STRUCTURE 


Circuits with a treelike structure, where each gate has a fan-out of |, are particularly easy 
to test. The most common forms of such circuits are the sum-of-products or product-of- 
sums. Since there is a unique path from each primary input to the output of the circuit, it is 
sufficient to derive the tests for faults on the primary inputs. We will illustrate this concept 
by means of the sum-of-products circuit in Figure 11.5. 

If any input of an AND gate is stuck-at-0, this condition can be detected by setting all 
inputs of the gate to 1 and ensuring that the other AND gates produce 0. This makes f = 1 
in the good circuit, and f = 0 in the faulty circuit. Three such tests are needed because 
there are three AND gates. 

Testing for stuck-at-1 faults is slightly more involved. An input of an AND gate is 
tested for the stuck-at-1 fault by driving it with the logic value 0, while the other inputs of 
the gate have the logic value 1. Thus a good gate produces the output of 0, and a faulty 


al 


W3 


Figure 11.5 = Circuit with a tree structure. 
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Figure 11.6 _ Derivation of tests for the circuit in Figure 11.5. 


gate generates |. At the same time, the other AND gates must have the output of 0, which 
is accomplished by making at least one input of these gates equal to 0. 

Figure 11.6 shows the derivation of the necessary tests. The first three tests are for 
the stuck-at-0 faults. Test 4 detects a stuck-at-1 fault on either the first input of the top 
AND gate or the third inputs of the other two gates. Observe that in each case the tested 
input is driven by logic 0, while the other inputs are equal to 1. This yields the test vector 
w1W2Ww3W4 = 0100. Clearly, it is useful to test inputs on as many gates as possible using a 
single test vector. Test 5 detects a fault on either the second input of the top gate or the first 
input of the bottom gate; it does not test any inputs of the middle gate. The required test 
pattern is 1110. Three more tests are needed to detect stuck-at-1 faults on the remaining 
inputs of the AND gates. Therefore, the complete test set is 


Test set = {1000, 0101, 0111, 0100, 1110, 1001, 1111, 0011} 


| 11.5 RANDOM TESTS 


So far we have considered the task of deriving a deterministic test set for a given circuit, 
primarily relying on the path-sensitizing concept. In general, it is difficult to generate such 
test sets when circuits become larger. A useful alternative is to choose the tests at random, 
which we will explore in this section. 

Figure 11.7 gives all functions of two variables. For an n-variable function, there are 
92" possible functions; hence there are 22? = 16 two-variable functions. Consider the XOR 
function, implemented as shown in Figure 11.8. Let us consider the possible stuck-at-0 and 
stuck-at-1 faults on wires b, c, d, h, and k in this circuit. Each fault transforms the circuit 
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wiwa| fo fi kh fbf fe fs fo fh fs fo fio fu fio fis fia fis 
00 0 0 0 0 0 0 0 0 1 =21 1 1 1 1 1 1 
Ol 0 0 0 0 1 1 1 1 +0 0 0 0 1 1 1 1 
10 0 0 1 1 0 0 1 +1 =+=0 +0 1 1 0 0) 1 1 
11 0 1 0 1 0 1 0 1 0 41 0 1 0 1 0 1 


Figure 11.7. All two-variable functions. 


Figure 11.8 — The XOR circuit. 


into a faulty circuit that implements a function other than XOR, as indicated in Figure 11.9. 
To test the circuit, we can apply one or more input valuations to distinguish the good circuit 
from the possible faulty circuits listed in Figure 11.9. Choose arbitrarily w;w2 = 01 as the 
first test. This test will distinguish the good circuit, which must generate f = 1, from the 
faulty circuits that realize fo, fo, f3, and fi9, because each of these would generate f = 0. 
Next, arbitrarily choose the test wjw2 = 11. This test distinguishes the good circuit from the 
faulty circuits that realize fs, f7, and f(s, in addition to /;, which we have already tested for 
using w1W2 = O1. Let the third test be w;w2 = 10; it will distinguish the good circuit from 
J, and f\2. These three tests, chosen in a seemingly random way, detect all faulty circuits 
that involve the faults in Figure 11.9. Moreover, note that the first two tests distinguish 
seven of the nine possible faulty circuits. 

This example suggests that it may be possible to derive a suitable test set by selecting 
the tests randomly. How effective can random testing be? Looking at Figure 11.7, we 
see that any of the four possible tests distinguishes the correct function from eight faulty 
functions, because they produce different output values for this input valuation. These 
eight faulty functions detectable by a single test are one-half of the total number of possible 
functions (22-1 for the two-variable case). The test cannot distinguish between the correct 
function and the seven faulty functions that produce the same output value. The application 
of the second test distinguishes four of the remaining seven functions because they produce 
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Figure 11.9 The effect of various faults. 


an output value that is different from the correct function. Thus each application of a 
new test essentially cuts in half the number of faulty functions that have not been detected. 
Consequently, the probability that the first few tests will detect a large portion of all possible 
faults is high. More specifically, the probability that each faulty circuit can be detected by 
the first test is 

1 


_ . gy 8 
= a 2 Geo 


This is the ratio of the number of faulty circuits that produce an output value different from 
the good circuit, to the total number of faulty circuits. 

This reasoning is readily extended to n-variable functions. In this case the first test 
detects 2?"-! out of a total of 2?” — 1 possible faulty functions. Therefore, if m tests are 
applied, the probability that a faulty circuit will be detected is 


m 
1 


— pas 


i=1 


Py = 


This expression is depicted in graphical form in Figure 11.10. The conclusion is that random 
testing is very effective and that after a few tens of tests the existence of a fault is likely to 
be detected even in very large circuits. 

Random testing works particularly well for circuits that do not have high fan-in. If 
fan-in is high, then it may be necessary to resort to other testing schemes. For example, 
suppose that an AND gate has a large number of inputs. Then there is a problem with 
detecting stuck-at-1 faults on its inputs, which may not be covered by random tests. But it 
is possible to test for these faults using the approach described in Section 11.4. 
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Figure 11.10 _—_ Effectiveness of random testing. 


The simplicity of random testing is a very attractive feature. For this reason, coupled 
with good effectiveness of tests, this technique is often used in practice. 
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11.6 TESTING OF SEQUENTIAL CIRCUITS 


As seen in the previous sections, combinational circuits can be tested effectively, using 
either deterministic or random test sets. It is much more difficult to test sequential circuits. 
The presence of memory elements allows a sequential circuit to be in various states, and the 
response of the circuit to externally applied test inputs depends on the state of the circuit. 

A combinational circuit can be tested by comparing its behavior with the functionality 
specified in the truth table. An equivalent attempt would be to test a sequential circuit 
by comparing its behavior with the functionality specified in the state table. This entails 
checking that the circuit performs correctly all transitions between states and that it produces 
a correct output. This approach may seem easy, but in reality it is extremely difficult. A 
big problem is that it is difficult to ascertain that the circuit is in a specific state if the state 
variables are not observable on the external pins of the circuit, which is usually the case. 
Yet for each transition to be tested, it is necessary to verify with complete certainty that the 
correct destination state was reached. Such an approach may work for very small sequential 
circuits, but it is not feasible for practical-size circuits. A much better approach is to design 
the sequential circuits so that they are easily testable. 


11.6.1 DESIGN FOR TESTABILITY 
A synchronous sequential circuit comprises the combinational circuit that implements the 


output and next-state functions, as well as the flip-flops that hold the state information 
during a clock cycle. A general model for the sequential circuits is shown in Figure 6.85. 
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The inputs to the combinational network are the primary inputs, w, through w,, and the 
present state variables, y; through y,. The outputs of the network are the primary outputs, 
z, through z,,, and the next-state variables, Y; through Y;. The combinational network 
could be tested using the techniques presented in the previous sections if it were possible 
to apply tests on all of its inputs and observe the results on all of its outputs. Applying the 
test vectors to the primary inputs poses no difficulty. Also, it is easy to observe the values 
on the primary outputs. The question is how to apply the test vectors on the present-state 
inputs and how to observe the values on the next-state outputs. 

A possible approach is to include a two-way multiplexer in the path of each present-state 
variable so that the input to the combinational network can be either the value of the state 
variable (obtained from the output of the corresponding flip-flop) or the value that is a part 
of the test vector. A significant drawback of this approach is that the second input of each 
multiplexer must be directly accessible through external pins, which requires many pins if 
there are many state variables. An attractive alternative is to provide a connection that allows 
shifting the test vector into the circuit one bit at a time, thus trading off pin requirements 
for the time it takes to perform a test. Several such schemes have been proposed, one of 
which is described below. 


Scan-Path Technique 


A popular technique, called the scan path, uses multiplexers on flip-flop inputs to allow 
the flip-flops to be used either independently during normal operation of the sequential 
circuit, or as a part of a shift register for testing purposes. Figure 11.11 presents the general 
scan-path structure for a circuit with three flip-flops. A 2-to-1 multiplexer connects the D 
input of each flip-flop either to the corresponding next-state variable or to the serial path 
that connects all flip-flops into a shift register. The control signal Normal /Scan selects the 
active input of the multiplexer. During the normal operation the flip-flop inputs are driven 
by the next-state variables, Y;, Y2, and Y3. 

For testing purposes the shift-register connection is used to scan in the portion of each 
test vector that involves the present-state variables, y;, y2, and y3. This connection has Q; 
connected to Dj,,. The input to the first flip-flop is the externally accessible pin Scan-in. 
The output comes from the last flip-flop, which is provided on the Scan-out pin. 

The scan-path technique involves the following steps: 


1. The operation of the flip-flops is tested by scanning into them a pattern of Os and Is, 
for example, 01011001, in consecutive clock cycles, and observing whether the same 
pattern is scanned out. 


2. The combinational circuit is tested by applying test vectors on w1w2-- + Wnyiy2y3 and 
observing the values generated on Z12Z2 -- - ZmY1¥2Y3. This is done as follows: 


e¢ The y;y2y3 portion of the test vector is scanned into the flip-flops during three 
clock cycles, using Normal /Scan = 1. 


¢ The w\w2---w, portion of the test vector is applied as usual and the normal 
operation of the sequential circuit is performed for one clock cycle, by setting 
Normal /Scan = 0. The outputs z)Z2 - - - Z,, are observed. The generated values of 
Y, Y2Y3 are loaded into the flip-flops at this time. 


e The select input is changed to Normal /Scan = 1, and the contents of the 
flip-flops are scanned out during the next three clock cycles, which makes the 


11.6 TESTING OF SEQUENTIAL CIRCUITS 667 


Combinational 
circuit 


Scan-out 


Clock Scan-in. Normal /Scan 
Figure 11.11 Scan-path arrangement. 


Y, Y2Y3 portion of the test result observable externally. At the same time, the next 
test vector can be scanned in to reduce the total time needed to test the circuit. 


The next example shows a specific circuit that is designed for scan-path testing. 


Figure 6.75 shows a circuit that recognizes a specific input sequence, which was discussed Example 11.3 
in Section 6.9. The circuit can be made easily testable by modifying it for scan path as 
shown in Figure 11.12. The combinational part, consisting of four AND and two OR gates, 
is the same in both figures. 
The flip-flops can be tested by scanning through them a sequence of Os and Is as 
explained above. The combinational circuit can be tested by applying test vectors on w, 
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Resetn Clock Normal /Scan 


Figure 11.12 = Circuit for Example 11.3. 


y,, and y2. Let us use the random-testing approach, choosing arbitrarily four test vectors 
wyiy2 = 001, 110, 100, and 111. To apply the first test vector, the pattern y;y2 = O1 is 
scanned into the flip-flops during two clock cycles. Then for one clock cycle, the circuit 
is made to operate in the normal mode with w = 0. This essentially applies the vector 
wy y2 = 001 to the AND-OR circuit. The result of this test should be z = 0, Y; = 0, and 
Y, = 0. The value of z can be observed directly. The values of Y; and Y> are loaded into the 
respective flip-flops, and they are scanned out during the next two clock cycles. As these 
values are being scanned out, the next test pattern yyy. = 10 can be scanned in. Thus it 
takes five cycles to perform one test, but the last two cycles are overlapped with the second 
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test. The third and fourth tests are performed in the same way. The total time needed to 
perform all four tests is 14 clock cycles. 

The preceding approach is based on testing a sequential circuit by testing its combina- 
tional part using the techniques developed in the previous sections. The scan-path facility 
makes it also possible to test the sequential circuit by making it go through all transitions 
specified in the state table. The circuit can be placed into a given state simply by scanning 
into the flip-flops the valuation of the state variables that denotes this state. The result of 
the transition can be checked by observing the primary outputs and by scanning out the 
valuation that presents the destination state. We leave it to the reader to develop the details 
of this approach (see Problem 11.16). 

One limitation of the scan-path technique is that it does not work well if the asyn- 
chronous preset and reset features of the flip-flops are used during normal operation. We 
have already suggested that it is better to use synchronous preset and reset. If the designer 
wishes to use the asynchronous preset and reset capability, then a testable circuit can be 
designed using techniques such as the level-sensitive scan design [1, 9]. The reader can 
consult the references for a description of this technique. 
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Until now we have assumed that testing of logic circuits is done by externally applying the 
test inputs and comparing the results with the expected behavior of the circuit. This requires 
connecting external equipment to the circuit under test. An interesting question is whether 
it is possible to incorporate the testing capability within the circuit itself so that no external 
equipment is needed. Such built-in capability would allow the circuit to be self-testable. 
This section presents a scheme that provides the built-in self-test (BIST) capability. 

Figure 11.13 shows a possible BIST arrangement in which a test vector generator 
produces the test vectors that must be applied to the circuit under test. In Section 11.5 
we explained that randomly chosen test vectors give good results, with the fault coverage 
depending on the number of tests performed. For each test vector applied to the circuit, it is 
necessary to determine the required response of the circuit. The response of a good circuit 
may be determined using the simulator tool of a CAD system. The expected responses to 
the applied tests must be stored on the chip so that a comparison can be made when the 
circuit is being tested. 


Test Circuit Test 
vector under result 


generator test compressor 


Signature 


Figure 11.13 The testing arrangement. 
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A practical approach for generating the test vectors on-chip is to use pseudorandom 
tests, which have the same characteristics as random tests but are produced deterministically 
and can be repeated at will. The generator for pseudorandom tests is easily constructed 
using a feedback shift-register circuit. A small example of a possible generator is given in 
Figure 11.14. A four-bit shift register, with the signals from the first and fourth stages fed 
back through an XOR gate, generates 15 different patterns during successive clock cycles. 
If the shift register is set at the beginning to x3x2x,x9 = 1000, then the generated patterns 
are as shown in part (b) of the figure. Observe that the pattern 0000 cannot be used, because 
the circuit would be locked in this pattern indefinitely. 

The circuit in Figure 11.14 is representative of a class of circuits known as linear 
feedback shift registers (LFSRs). Using feedback from the various stages of an n-bit shift 
register, connected to the first stage by means of XOR gates, it is possible to generate a 
sequence of 2” — | patterns that have the characteristics of randomly generated numbers. 
Such circuits are used extensively in error-correcting codes. The theory of operation of these 
circuits is presented in a number of books [1—3, 10]. A table of the feedback connections for 


PRBS 


(a) Circuit 


(b) Generated sequence 


Figure 11.14 — Pseudorandom binary sequence generator (PRBSG). 
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various values of n, which generate a maximum-length pseudorandom sequence, is given 
in Peterson and Weldon [11]. 

The pseudorandom binary sequence generator (PRBSG) gives a simple method of 
generating tests. The required response of the circuit under test can be determined by using 
the simulator tool of the CAD system. The remaining question is how to check whether 
the circuit indeed produces the required response. It is not attractive to have to store a 
large number of responses to the tests on a chip that also includes the main circuit. A 
practical solution is to compress the results of the tests into a single pattern. This can 
be done using an LFSR circuit. Instead of just providing the feedback signals as the 
input, a compressor circuit includes the output signals produced by the circuit under test. 
Figure 11.15 shows a single-input compressor circuit (SIC), which uses the same feedback 
connections as the PRBSG of Figure 11.14. The input p is the output of a circuit under test. 
After applying a number of test vectors, the resulting values of p drive the SIC and, coupled 
with the LFSR functionality, produce a four-bit pattern. The pattern generated by the SIC 
is called a signature of the tested circuit for the given sequence of tests. The signature 
represents a single pattern that may be interpreted as a result of all the applied tests. It can 
be compared against a predetermined pattern to see if the tested circuit is working properly. 
Storing a single n-bit pattern for comparison purposes presents only a small overhead. The 
randomizing nature of the compressor circuits based on LFSRs provides a good coverage 
of patterns that may result from a faulty circuit [12]. 

If the circuit under test has more than one output, then an LSFR with multiple inputs 
can be used. Figure 11.16 illustrates how four inputs, po through p3, can be added to the 
basic circuit of Figure 11.14. Again the four-bit signature provides a good mechanism for 
distinguishing among different sequences of four-bit patterns that may appear on the inputs 
of this multiple-input compressor circuit (MIC). 

A complete BIST scheme for a sequential circuit may be implemented as indicated in 
Figure 11.17. The scan-path approach is used to provide a testable circuit. The test pat- 
terns that would normally be applied on the primary inputs W = w,w>-- - w, are generated 
internally as the patterns on X = x1x2---x,. Multiplexers are needed to allow switching 
from W to X, as inputs to the combinational circuit. A pseudorandom binary sequence 
generator, PRBSG-X, generates the test patterns for X. The portion of the tests applied 
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Figure 11.15 — Single-input compressor circuit (SIC). 
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Figure 11.16 — Multiple-input compressor circuit (MIC). 
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Figure 11.17 BIST in a sequential circuit. 
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via the next-state signals, y, is generated by the second PRBS generator, PRBSG-y. These 
patterns are scanned into the flip-flops as explained in Section 11.6. 

The test outputs are compressed using the two compressor circuits. The patterns on 
the primary outputs, Z = z;Z2---Zm, are compressed using the MIC circuit, and those on 
the next-state wires Y = Y,; Y2---Y,, by the SIC circuit. These circuits produce the Z- 
signature and Y-signature, respectively. The testing procedure is the same as given in 
Example 11.3 except that the comparison with the test result that a good circuit is supposed 
to give is done only once; at the end of the testing process the two signatures are com- 
pared with the stored patterns. Figure 11.17 does not show the circuitry needed to store 
these patterns and perform the comparison. Instead of storing the signature patterns of the 
required results as a part of the designed circuit, it is possible to shift out the contents of 
MIC and SIC shift registers onto two output pins and to perform the necessary compari- 
son with the expected signatures externally. Note that using signature testing in this way 
reduces the testing time significantly, compared to the time it would take to test the circuit 
by scanning out the results of individual tests and comparing them with predetermined 
patterns. 

The effectiveness of the BIST approach depends on the length of the LFSR generator 
and compressor circuits. Longer shift registers give better results [13]. One reason for 
failing to detect that the circuit under test may be faulty is that the pseudorandomly generated 
tests do not have perfect coverage of all possible faults. Another reason is that a signature 
generated by compressing the outputs of a faulty circuit may coincidentally end up being 
the same as the signature of the good circuit. This can occur because the compression 
process results in a loss of some information, such that two distinct output patterns may be 
compressed into the same signature. This is known as the aliasing problem. 


11.7.1 BUILT-IN LoGic BLOCK OBSERVER 


The essence of BIST is to have internal capability for generation of tests and for compression 
of the results. Instead of using separate circuits for these two functions, it is possible to 
design a single circuit that serves both purposes. Figure 11.18 shows the structure of a 
possible circuit, known as the built-in logic block observer (BILBO) [14]. This four-bit 
circuit has the same feedback connections as the circuit of Figure 11.14. 

The BILBO circuit has four modes of operation, which are controlled by the mode bits, 
M, and M>. The modes are as follows: 


e M,M) = 11—Normal system mode in which all flip-flops are independently controlled 
by the signals on inputs po through p3. In this mode each flip-flop may be used to 
implement a state variable of a finite state machine by using pp to p3 as yo to y3. 

e MM) = 00 — Shift-register mode in which the flip-flops are connected into a shift 
register. This mode allows test vectors to be scanned in, and the results of applied tests 
to be scanned out, if the control input G/S is equal to 1. If G/S = 0, then the circuit 
acts as the PRBS generator. 


e M,M>, = 10 — Signature mode in which a series of patterns applied on inputs po 
through p3 are compressed into a signature available as a pattern on go through q3. 


¢ M,M) = 01 — Reset mode in which all flip-flops are reset to 0. 
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Figure 11.18 — A four-bit built-in logic block observer (BILBO). 
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Figure 11.19 — Using BILBO circuits for testing. 


An efficient way of using BILBO circuits is presented in Figure 11.19. A combinational 
circuit can be tested by partitioning it into two (or more) parts. A BILBO circuit is used to 
provide inputs to one part and to accept outputs from the other part. The testing process 
involves a two-phase approach. First, BILBO! is used as a PRBS generator that provides 
test patterns for combinational network 1 (CN1). During this time BILBO2 acts as a 
compressor and produces a signature for the test. The signature is shifted out by placing 
BILBO? into the shift-register mode. Next, the roles of BILBO1 and BILBO2 are reversed, 
and the process is repeated to test CN2. 
The detailed steps in the testing process are 


1. Scan the initial test pattern into BILBO! and reset all flip-flops in BILBO2. 


2. Use BILBO1 as the PRBS generator for a given number of clock cycles and use 
BILBO? to produce a signature. 


3. Scan out the contents of BILBO2 and externally compare the signature; then scan 
into it the initial test pattern for testing CN2. Reset the flip-flops in BILBO1. 


4. Use BILBO2 as the PRBS generator for a given number of clock cycles and use 
BILBO! to produce a signature. 


5. Scan out the signature in BILBO! and externally compare it with the required pattern. 


The BILBO circuits are used in this way for testing purposes. At other times the normal 
system mode is used. 


11.7.2 SIGNATURE ANALYSIS 


We have explained the use of signatures in the context of implementing an efficient built- 
in testing mechanism. The main idea of compressing a long sequence of test results into 
a single signature was originally developed as the basis for an instrument manufactured 
by Hewlett-Packard in the 1970s, known as the Signature Analyzer [15]. Thus the name 
signature analysis was coined to refer to the testing schemes that use signatures to represent 
the results of applied tests. 
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Signature analysis is particularly suitable for digital systems that naturally include 
an ability to generate the desired test patterns. Such is the case with computer systems in 
which various parts of the system can be stimulated by test patterns produced under software 
control. 


11.7.3 BOUNDARY SCAN 


The testing techniques discussed in the previous sections are equally applicable to circuits 
that are implemented on single chips or on printed circuit boards that contain a number of 
chips. A circuit can be tested only if it is possible to apply the tests to it and observe the 
outputs produced. This involves having access to the primary inputs and outputs. 

When chips are soldered onto a printed circuit board, it often becomes impossible to 
attach test probes to pins. This hinders the testing process unless some indirect access to the 
pins is provided. The scan-path concept can be extended to the board level to deal with the 
problem. Suppose that each primary input or output pin on a chip is connected through a D 
flip-flop and that a provision is made for a test mode in which all flip-flops can be connected 
into a shift register. Then the test information can be scanned in and scanned out using the 
shift-register path, via two pins that serve as serial input and output. Connecting the serial 
output pin of one chip to the serial input pin of another chip results in the pins of all chips 
being connected into a board-wide shift register for testing purposes. This approach has 
become popular in practice and has been embodied into the IEEE Standard 1149.1 [16]. 


11.8 PRINTED CIRCUIT BOARDS 


Design and testing techniques presented in this book can be applied to any logic circuit, 
whether the circuit is realized on a single chip or its implementation involves a number 
of chips placed on a printed circuit board (PCB). In this section we discuss some practical 
issues that arise when one or more circuits that form a larger digital system are implemented 
on a PCB. 

Atypical PCB contains multiple layers of wiring. When the board is manufactured, the 
wiring pattern on each layer is generated. The layers are separated by insulating material 
and pressed together in sandwichlike fashion to form the board. Connections between 
different wiring levels are made through holes that are provided for this purpose. Chips 
and other components are then soldered to the top and possibly to the bottom layers. 

In preceding chapters we have discussed in considerable detail the CAD tools used 
for designing circuits that can be implemented on a single chip, such as an FPGA. For a 
multiple-chip implementation, we need a different set of CAD tools to design a PCB that 
incorporates the chips and connections needed to realize the complete digital system. Such 
tools are available from a number of companies, for example, Cadence Design Systems 
and Mentor Graphics. These tools can automatically determine where each chip should be 
placed on the PCB, but the designer can also specify the location of particular chips. This 
is called the placement process. Given a specific placement of chips and other components 
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(such as connectors and capacitors), the tools generate a layout for each layer of wiring 
traces that provide the required connections on the board. This process is referred to as 
routing. Again the designer can intervene and manually route some connections. However, 
since the number of connections can be in the tens of thousands, it is crucial to obtain a 
good automated solution. 

In addition to the design issues discussed in the previous chapters, a large circuit 
implemented on a PCB is subject to some other constraints. Signals on the wiring traces 
may be affected by noise problems caused by crosstalk, spikes in the power supply voltage, 
and reflections from the end points of long traces. 


Crosstalk 


Two closely spaced wires that run parallel to each other are capacitively coupled, and 
a pulse on one wire can induce a similar (but usually much smaller) pulse on the adjoining 
wire. This is referred to as crosstalk. Its existence is undesirable because it contributes to 
noise problems. 

When drawing timing diagrams, we usually draw ideal waveforms with sharp edges, 
which have well-defined voltage levels for the logic values 0 and 1. In an actual circuit the 
corresponding signals may depart significantly from the desired behavior. As explained in 
Appendix B, noise in a circuit can affect voltage levels, which can be troublesome. For 
example, if at some point in time the noise diminishes the value of a signal that should be 
at logic | to a level where this signal is interpreted by the next gate as being logic 0, then a 
malfunction in the circuit is likely to occur. Since the noise effects tend to be random, they 
are often difficult to detect. 

To minimize crosstalk, it is prudent to avoid having long wires running parallel in close 
proximity to each other. This may be difficult to achieve because of limited space on a PCB 
and the need to provide a large number of wires. Using additional layers (planes) of wiring 
helps in coping with crosstalk problems. 


Power Supply Noise 

When a CMOS circuit changes its state, there is a momentary flow of current in the 
circuit, which is manifested as a current pulse on the power supply (Vpp and Ground) wires. 
Since a wiring trace on a PCB has a small “line inductance,” such a current pulse causes a 
voltage spike (short pulse) on these lines. The cumulative effect of such voltage spikes can 
cause a malfunction of the circuit. 

The induced voltage spikes can be reduced significantly by connecting a small capacitor 
between the Vpp and Ground wires, in close proximity to the chip that causes the spikes 
to occur. Since these spikes have the characteristic of a very high frequency signal, the 
path through the capacitor is essentially a short circuit for them. Thus the voltage spikes 
“bypass” the power supply lines and do not affect other chips connected to the same lines. 
Such capacitors are called bypass capacitors. They do not affect the DC voltage on the 
power supply lines. 


Reflections and Terminations 


Wiring traces on a PCB act as simple wires in circuits when the clock frequency is 
low. However, at higher clock frequencies it becomes necessary to worry about so-called 
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transmission-line effects. When a signal propagates along a long wire, it is attenuated due 
to the small resistance of the wire, it picks up crosstalk that manifests itself as noise, and it 
may be reflected when it reaches the end of the wire. The reflection causes a problem if its 
effect does not die down before the next active clock edge. The discussion of transmission- 
line effects is beyond the scope of this book. We will only mention that the reflection of 
signals can be prevented by placing a suitable “termination” component on the line. This 
termination can be as simple as a resistor whose resistance matches the apparent resistance 
of the line, known as the characteristic impedance of the line. Other forms of termination 
are also possible. For details of such schemes, the reader may consult other references 
[17-18]. 


11.8.1 TESTING or PCBs 


The manufactured PCB has to be tested thoroughly. Flaws in the manufacturing process 
may cause some connections to be broken and others to be shorted by a solder blob that 
touches two adjacent wires. There may be problems caused by design errors that were not 
discovered during the design process. Finally, some chips and other components on the 
PCB may be defective. 


Power Up 

The first step is to turn on the power supply. In the worst case this may cause some 
chip to be destroyed because of a fatal short-circuit condition (in an extreme case a chip 
package may actually blow apart). Assuming that this is not the case, it is essential to check 
if any of the chips is becoming inordinately hot. Overheating is a symptom of a serious 
problem that must be corrected. 

It is also necessary to check that the power and ground connections are properly made 
on each chip and that the voltage level is as specified. 


Reset 

The next step is to reset all circuitry on the PCB to reach a predetermined starting 
point. This typically implies resetting the flip-flops, which is usually achieved by asserting 
a common reset line. It is important to verify that the starting state is correctly established. 


Low-Level Functional Testing 

Since practical circuits can be extremely complex, it is prudent to test the basic func- 
tionality first. A key test is to verify that the control signals are working correctly. 

Using the divide-and-conquer approach, simple functions are tested first, followed by 
the more complex ones. 


Full Functional Testing 

Having verified the operation of smaller subcircuits, it is necessary to test the func- 
tionality of the entire system on the PCB. The number of errors often depends on the 
thoroughness of the simulation done during the design process. In general, it is difficult 
to simulate large digital systems fully, so some errors are likely to be found on the PCB. 
Typical errors are due to 
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e Manufacturing errors, such as wrong wiring traces, blown components, or incorrect 
power supply voltage. 


e Incorrect specifications. 
e Designer’s misinterpretation of information on the data sheets that describe some chips. 


e Incorrect information on the data sheets provided by the chip manufacturer. 


As mentioned earlier, PCBs contain multiple layers of wiring. Each layer may have several 
thousands of wires in it. Finding and fixing errors can be a difficult and time-consuming 
task, especially if errors involve wires in internal (as opposed to the top or bottom) wiring 
layers. 

Timing 

It is next necessary to verify the timing of the circuit. A good strategy is to start with 
a slow clock. If the circuit works properly, then the clock frequency is gradually increased 
until the required operating frequency is reached. 

Timing problems arise because of propagation delays through various paths in a circuit. 
These delays are caused by the logic gates and the wiring that interconnects them. It is 
essential to ensure that all data inputs to flip-flops in the circuit are stable before the active 
edge of the clock signal arrives, as required by the setup time. 


Reliability 

A digital system is expected to operate reliably for a long time. Its reliability may be 
affected by several factors, such as timing, noise, and crosstalk problems. 

The timing of signals has to provide some safety margin to allow for small changes in 
propagation delays. If the timing is too tight, then it is likely that the circuit will operate 
correctly for some period of time, but will eventually fail because of a timing error. The 
timing of chips may change with temperature, so failures can occur if thermal constraints 
are not adhered to. Cooling is usually provided by means of fans. 


11.8.2 INSTRUMENTATION 


Testing of circuits implemented in PCBs requires some specialized instruments. 


Oscilloscope 

The details of individual signals can be examined using an oscilloscope. This instru- 
ment displays the voltage waveform of a signal, showing the potential problems with respect 
to propagation delay and noise. The waveform displayed on an oscilloscope shows the ac- 
tual voltage levels of the signal; it does not depict the simplified view of ideal waveforms 
that have perfectly square edges. If the user wants to see only the logic values of a signal 
(0 or 1), then a different type of instrument called a logic analyzer can be used. 


Logic Analyzer 

While an oscilloscope allows simultaneous examination of a few signals, a logic an- 
alyzer allows examination of tens or even hundreds of signals at the same time. It takes 
inputs from a set of points in the circuit, by means of probes attached to these points, and 
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digitizes and displays the detected signals in the form of waveforms on a screen. A powerful 
feature of the logic analyzer is that it has internal storage capable of recording a sequence 
of changes in the signals over a substantial period of time. Then any segment of this infor- 
mation can be displayed as desired by the operator. Typically, it is possible to record a few 
milliseconds’ worth of events, which involves many cycles of a normal digital clock. 

Looking at the waveforms taken when the circuit under test is working properly is not 
helpful in the debugging process. It is essential to see the waveforms generated when a 
malfunction takes place. The logic analyzer can be “triggered” to record a window of events 
that occurred before and after the trigger event. The user must specify the trigger event. 
For example, suppose that a malfunction is suspected to be caused by two control signals, 
A and B, being asserted at the same time, even though the design specification requires that 
these signals be mutually exclusive. A useful trigger point can then be established as the 
time when the AND of A and B has the value 1. Finding suitable trigger events can be 
difficult, and the user must rely on intuition and experience. 

To use a logic analyzer effectively, it must be possible to connect the probes to some 
useful (for testing purposes) points in the circuit. Thus it is important to provide such “test” 
points when a PCB is being designed. 


11.9 CONCLUDING REMARKS 


Manufactured products must be tested to ensure that they perform as expected. All of the 
techniques discussed in this chapter are relevant for this type of testing. The development 
of tests and the required responses is based on the assumption that the circuits are designed 
correctly. Thus it is the validity of the physical implementation that is being tested. 

Another aspect of testing occurs during the design process. The designer has to ascertain 
that the designed circuit meets the specifications. From the testing point of view, this poses 
a significant problem because there exists no provably good circuit that can be used to 
generate the desired tests. CAD tools are helpful in deriving tests for a designed circuit, but 
they cannot determine whether the circuit is indeed what the designer intended to achieve 
in terms of its functionality. A design error usually results in a circuit that has somewhat 
different functionality than required by the specification. 

Small circuits can be tested fully to verify their functionality. A combinational circuit 
can be tested to see if it performs according to its truth table. A sequential circuit can be 
tested by checking the transitions specified in the state table. This is much easier to do if the 
circuit is designed for testability, as explained in Section 11.6.1. Large circuits cannot be 
tested exhaustively, because a vast number of tests would have to be applied. In this case 
the designer’s ingenuity is needed to determine a manageable set of tests that will hopefully 
demonstrate the correctness of the circuit. 


*11.1 


PROBLEMS 


Derive a table similar to Figure 11.1b for the circuit in Figure P11.1 to show the coverage 
of the various stuck-at-0 and stuck-at-1 faults by the eight possible tests. Find a minimal 
test set for this circuit. 
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Figure P11.1 Circuit for Problem 11.1. 


11.2 Repeat Problem 11.1 for the circuit in Figure P11.2. 


Figure P11.2 — Circuit for Problem 11.2. 


*11.3 Devise a test to distinguish between two circuits that implement the following expressions 


f = 1X23 + X2N3X4q + Hy X2X4 + H1A3ZX4 
& = (% +.X2)(%3 + X4) 


11.4 Consider the circuit in Figure P11.3. Sensitize each path in this circuit to obtain a complete 
test set that comprises a minimum number of tests. 
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Figure P11.3 © Circuit for Problem 11.4. 


*11.5 For the circuit of Figure 11.4a, show the tests that can detect each of the faults: w,/0, w4/1, 
g/0, and c/1. 
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11.6 Suppose that the tests w}w2w3w4 = 0100, 1010, 0011, 1111, and 0110 are chosen randomly 
to test the circuit in Figure 11.3. What percentage of single faults are detected using these 
tests? 


11.7 Repeat Problem 11.6 for the circuit in Figure 11.44. 
11.8 Repeat Problem 11.6 for the circuit in Figure 11.5. 


*11.9 Consider the circuit in Figure P11.4. Are all single stuck-at-0 and stuck-at-1 faults in this 
circuit detectable? If not, explain why. 
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Figure P11.4 Circuit for Problem 11.9. 


11.10. Prove that in a circuit in which all gates have a fan-out of 1, any set of tests that detects all 
single faults on the input wires detects all single faults in the entire circuit. 


*11.11 The circuit in Figure P11.5 determines the parity of a four-bit data unit. Derive a minimal 
test set that can detect all single stuck-at-0 and stuck-at-1 faults in this circuit. Would your 
test set work if the XOR gates are implemented using the circuit in Figure 8.9c? Can your 
result be extended to a general case that involves n-bit data units? 
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Wi 
Figure P11.5 = Circuit for Problem 11.11. 


*11.12 Derive a test set that can detect all single faults in the decoder circuit in Figure 4.14c. 


11.13 List all single faults in the circuit in Figure 11.4a that can be detected using each of the tests 
W1W2w3W4 = 1100, 0010, and 0110. 
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11.14 — Sensitize each path in the combinational part of the circuit in Figure 11.12 to obtain a 
complete test set that comprises as few tests as possible. Show how your test set can be 
applied to test this circuit. How many clock cycles are needed to perform the necessary 
tests? 


11.15 Derive an ASM chart that represents the flow of control needed to test the circuit in Fig- 
ure 11.12. 


11.16 The circuit in Figure 11.12 provides an easily testable implementation of the FSM in Fig- 
ure 6.76. In Example 11.3 we showed how this circuit may be tested by testing the combi- 
national part using randomly chosen tests. A different approach to testing may be to attempt 
to determine whether the circuit actually realizes the functionality specified in the state table 
in Figure 6.76b. This can be done by making the circuit go through all transitions given 
in the state table. For example, after applying the Resetn = 0 signal, the circuit begins in 
state A. It must be verified that the circuit is indeed forced into state A by scanning out 
the expected valuation y2y; = 00. Next each transition must be checked. To verify the 
transition A — A if w = 0, it is necessary to make the input w equal to 0 and allow the 
normal operation to take place for one clock cycle by making Normal /Scan = 0. The value 
of the output z must be observed. This is followed by scanning out the values of y2 and y; 
to see if y2y; = 00. At the same time, the valuation for the next test should be scanned in. 
If this test involves verifying that B — A if w = 0, then the valuation y2y; = 01 is scanned 
in. This process continues until all transitions have been verified. 


Indicate in the form of a table the values of the signals Normal /Scan, Scan-in, Scan-out, 
w, and z, as well as the transition tested, for each clock cycle necessary to perform the 
complete test for this circuit. 


11.17 Write Verilog code that represents the circuit in Figure 11.12. 


11.18 Derive an ASM chart that describes the control needed to test a digital system that uses the 
BILBO structure in Figures 11.18 and 11.19. 
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A 


VERILOG REFERENCE 


This appendix describes the features of Verilog used in this book. It is meant to serve as a 
convenient reference for the reader, hence only brief descriptions are provided, along with 
examples. The reader is encouraged to first study the introduction to Verilog in Section 2.10. 

This appendix is not meant to be a comprehensive Verilog manual. While we discuss 
almost all the features of Verilog that are useful in the synthesis of logic circuits, we do not 
discuss many of the features that are useful only for simulation of circuits. Although the 
omitted features are not needed for any of the examples used in this book, a reader who 
wishes to learn more about Verilog can refer to specialized texts [1-7]. 


How to Write Verilog Code 

The tendency for the novice is to write Verilog code that resembles a computer program, 
containing many variables and loops. It is difficult to determine what logic circuit the CAD 
tools will produce when synthesizing such code. The task of a synthesis tool is to analyze 
a piece of Verilog code and determine, according to the semantics of the language, what 
circuit can be used to implement the code. Consider a code fragment such as 


if (s == 0) 
f= w0; 

else 
f=wl; 


We can understand the semantics by considering each statement in sequence, in the way 
that a simulation tool would. The code results in f being assigned the value of either wo or 
w , depending on the value of s. A synthesis tool would usually implement this behavior 
using a multiplexer circuit. 

In general, synthesis tools have to recognize certain structures in code, like the above 
multiplexer. From the practical point of view, this will work only if users write code that 
conforms to a commonly used style. The beginning Verilog user should therefore adopt 
the style of code recommended by experienced designers. This book contains more than 
120 examples of Verilog code that represent a wide range of logic circuits. In all of these 
examples the code is easily related to the described logic circuit. The reader is encouraged 
to adopt the same style of code. A good approach is to “write Verilog code that obviously 
represents the intended circuit.” 
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Although Verilog is a fairly straightforward language to learn and use, the novice 
designer will tend to make some common errors in syntax and semantics. A list of typical 
errors is given in Section A.15, as well as a set of guidelines that expert Verilog coders 
recommend as good style for writing clear and effective code. 

Once complete Verilog code is written for a particular design, it is useful to analyze the 
resulting circuit synthesized by the CAD tools. Much can be learned about Verilog, logic 
circuits, and logic synthesis through this process. 


| A.1 DOCUMENTATION IN VERILOG CODE 


Documentation can be included in Verilog code by writing a comment. A short comment 
begins with the double slash, //, and continues to the end of the line. A long comment can 
span multiple lines and is contained inside the delimiters /* and */. Examples of comments 
are 
// This is a short comment 
/*This is a long Verilog comment 
that spans two lines */ 


| A.2 WHITE SPACE 


White space characters, such as SPACE and TAB, and blank lines are ignored by the Verilog 
compiler. Multiple statements can be written on a single line, such as 


f=w0; if(s == 1) f=wl; 


Although legal, this code is hard to read and uses poor style. Placing each statement on a 
separate line, and using indentation within blocks of code, such as an if-else statement, are 
good ways to increase the readability of code. 


| A.3 SIGNALS IN VERILOG CODE 


In Verilog, a signal in a circuit is represented as a net or a variable with a specific type. The 
term net is derived from the electrical jargon, where it refers to the interconnection of two 
or more points in a circuit. A net or variable declaration has the form 


type [range] signal_name{, signal_name}; 


The square brackets indicate an optional field, and the curly brackets indicate that additional 
entries are permitted. We will use this syntax throughout the appendix. The signal_name 
is an identifier, as defined in the next section. Without the range field the declared net or 
variable is scalar and represents a single-bit signal. The range is used to specify vectors 
that correspond to multibit signals, as explained in Section A.6. 


A.5 SIGNAL VALUES, NUMBERS, AND PARAMETERS 
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A.4 IDENTIFIER NAMES 


Identifiers are the names of variables and other elements in Verilog code. The rules for 
specifying identifiers are simple: any letter or digit may be used, as well as the _ underscore 
and $ characters. There are two caveats: an identifier must not begin with a digit and it 
should not be a Verilog keyword. Examples of legal identifiers are f, x1, x_y, and Byte. 
Some examples of illegal names are Lx, +y, x*y, and 258. Verilog is case sensitive, hence 
k is not the same as K, and BYTE is not the same as Byte. 

For special purposes Verilog allows a second form of identifier, called an escaped iden- 
tifier. Such identifiers begin with the (\) backslash character, which can then be followed 
by any printable ASCII characters except white spaces. Examples of escaped identifiers 
are \123, \sig-name, and \a+b. Escaped identifiers should not be used in normal Verilog 
code; they are intended for use in code produced automatically when other languages are 
translated into Verilog. 


A.5 SIGNAL VALUES, NUMBERS, AND PARAMETERS 


Verilog supports scalar nets and variables that represent individual signals, and vectors that 
correspond to multiple signals. Each individual signal can have four possible values: 


0 = logic value 0 

1 = logic value 1 

zZ = tri-state (high impedance) 
x = unknown value 


The z and x values can also be denoted by the capital letters Z and X. The value x can be 
used to denote a don’t-care condition in Verilog code; the symbol ? can also be used for 
this purpose. The value of a vector variable is specified by giving a constant of the form 


[size][’radix]constant 
where size is the number of bits in the constant, and radix is the number base. Supported 


radices are 


d = decimal 


b = binary 
h = hexadecimal 
o = octal 


When no radix is specified, the default is decimal. If size specifies more bits than are needed 
to represent the given constant, then in most cases the constant is padded with zeros. The 
exceptions to this rule are when the first character of the constant is either x or z, in which 
case the padding is done using that value. Some examples of constants include 
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0 the number 0 

10 the decimal number 10 

’b10 the binary number 10 = (2)j 
*h10 the hex number 10 = (16) jo 
4’b100 the binary number 0100 = (4)10 
4’ bx an unknown 4-bit value xxxx 
8’b1000_0011 _ can be inserted for readability 
8’ hfx equivalent to 8’b1111_xxxx 


A.5.1 PARAMETERS 


A parameter associates an identifier name with a constant. Let the Verilog code include the 
following declarations: 


parameter n = 4; 
parameter SO = 2’b00, S1 = 2’b01, S2 = 2’b10, S3 = 2’b11; 


Then the identifier 1 can be used in place of the number 4, the name SO can be substituted 
for the value 2’b00, and so on. An important use of parameters is in the specification of 
parameterized subcircuits, which is described in Section A.12. 


| A.6 NET AND VARIABLE TYPES 


Verilog defines a number of types of nets and variables. These types are defined by the 
language itself, and user-defined types are not permitted. 


A.6.1 NETS 


A net represents a node in a circuit. To distinguish between different types of circuit nodes 
there exist several types of nets, called wire, tri, and a number of others that are not needed 
for synthesis, and are not used in this book. 

The wire type is employed to connect an output of one logic element in a circuit to an 
input of another logic element. The following are examples of scalar wire declarations. 


wire x; 
wire Cin, AddSub; 


A vector wire represents multiple nodes, such as 


wire [3:0] S; 
wire [1:2] Array; 


A.6 NET AND VARIABLE TYPES 


The square brackets are the syntax for specifying a vector’s range. The range [R,:R,] can 
be either increasing or decreasing, as shown. In either case, R, is the index of the most- 
significant (leftmost) bit in the vector, and R; is the index of the least-significant (rightmost) 
bit. The indices R, and R» can be either positive or negative integers. 

The net S can be used as a four-bit quantity, or each bit can be referred to individually 
as $[3], S[2], S[1], and S[O]. If a value is assigned to S such as S$ = 4’b0011, the result 
is $[3] = 0, S[2] = 0, S[1] = 1, and S[0] = 1. The assignment of a single bit in a vector 
to another net, such as f = S[O], is called a bit-select operation. A range of values from 
one vector can be assigned to another vector, which is called a part-select operation. If 
we assign Array = S[2:1], this produces Array[1] = S[2] and Array[2] = S[1]. The index 
used in a bit-select operation can involve a variable, such as S[i], while the indices used 
with a part-select operation have to be constant expressions, such as S[2:1]. 

The tri type denotes circuit nodes that are connected in a tri-state fashion. Examples 
of tri nets are 

tri z; 

tri [7:0] DataOut; 
These nets are treated in the same manner as the wire type, and they are used only to enhance 
the readability of code that includes tri-state gates. 


A.6.2 VARIABLES 


Nets provide a means for interconnecting logic elements, but they do not allow a circuit 
to be described in terms of its behavior. For this purpose, Verilog provides variables. A 
variable can be assigned a value in one Verilog statement, and it retains this value until it 
is overwritten in a subsequent assignment statement. There are two types of variables, reg 
and integer. Consider the code fragment 


Count = 0; 
for (k =0;k <4;k=k+1) 
if (S[k]) 


Count = Count + 1; 


The for and if statements are described in Section A.11. This code stores in Count the 
number of bits in S that have the value 1. Since it models the behavior of a circuit, Count 
has to be declared as a variable, rather than a simple wire. If Count has three bits, then the 
declaration is 


reg [2:0] Count; 
The keyword reg does not denote a storage element, or register. In Verilog code, reg 
variables can be used to model either combinational or sequential parts of a circuit. In our 
example, the variable k serves as a loop index. Such variables are declared as type integer 
in the statement 
integer k; 

Integer variables are useful for describing the behavior of a module, but they do not directly 
correspond to nodes in a circuit. In this book we use integers as loop control variables. 
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A.6.3 MEMORIES 


A memory is a two-dimensional array of bits. Verilog allows such a structure to be declared 
as a variable (reg or integer) that is an array of vectors, such as 


reg [7:0] R [3:0]; 


This statement defines R as four eight-bit variables named R[3], R[2], R[1], and R[O]. 
Two-level indexing, such as R[3][7], can be used. There is also support for higher 
dimension arrays. A three-dimensional array may be declared as 


reg [7:0] R [3:0] [1:0]; 


This statement defines a three-dimensional array of bits. 


A.7 OPERATORS 


Verilog has a large number of operators, as shown in Table A.1. The first column gives 
the category, the second column indicates how each operator is used, and the third column 
specifies the number of bits produced in the result. To aid in describing the table, we use 


Table A.1_ Verilog operators and bit lengths. 


Category Examples Bit Length 
Bitwise ~A, +A, —-A L(A) 
A&B, A|B, A~<B, A’~B MAX (L(A), L(B)) 
Logical 1A, A&&B, A||B 1 bit 
Reduction &A, ~&A, |A, ~|A, *~A, ~*A 1 bit 
Relational A==B, A!=B, A>B, A<B 1 bit 
AS=B8, A<=8 
Assam, AS 
Arithmetic A+B, A-—B, AxB, A/B MAX (L(A), L(B)) 
A&B 
Shift A<<B, A>>B L(A) 
Concatenate (AeA B} L(A) +--+ + L(B) 
Replication {B{A}} Bx L(A) 


Condition A?B:C MAX (L(B), L(C)) 


A.7 OPERATORS 


operands named A, B, and C, which may be either vectors or scalars. The syntax ~A means 
that the ~ operator is applied to the variable A, and the syntax L(A) means that the result 
has the same number of bits (length) as in A. 

Most of the operators in Table A.1 are also listed in Table 4.2 and described in detail 
in Section 4.6.5. The bitwise operators are 1’s complement (~), unary plus (++), 2’s com- 
plement (—), AND (&), OR (|), XOR (“), and XNOR (~* or “~). The bitwise operators 
produce multibit results, usually with the same number of bits as the operands. For example, 
ifA = ajdao, B = bj bo, and C = cjco, then the operation C = A & Bresults inc; = a; & by 
and cp = do & bo. Note that unary plus just denotes a positive number; it has no effect. 

The logical operators generate a one-bit result. They are NOT (!), AND (&&), and OR 
(||). If the operand of ! is a vector, then !A produces 1 (True) only if all bits in A are 0; 
otherwise !A gives 0 (False). The result of A && B is 1 if both A and B are nonzero, while 
A || B produces | unless both A and B are zero. If an operand is ambiguous (contains an 
x), the result is x. The logical operators are normally used in conditional statements such 
as if ((A < B) && (B < C)). 

The reduction operators use the same symbols as some of the bitwise operators, but have 
only one operand. The reduction &A produces the AND of all of the bits in A, while ~&A 
produces the NAND. Similarly, the other reduction operators produce single-bit Boolean 
results. 

The relational operators give a | (True) or O (False) result based on the specified 
comparison of A and B. For synthesis of logic circuits A and B are usually wire or reg 
types, and Verilog treats them as unsigned numbers. If integer variables are supported, 
they may be treated as signed numbers. The result of a relational operation is ambiguous 
(x) if either operand has any unspecified digits. An exception is for the === and !== 
operators, which check for equality and inequality, respectively. These operators compare 
equality of x and z digits in addition to 0 and 1 digits. 

Verilog includes the normal arithmetic operators +, —, *, and /. The modulus operator 
(%) is also included, but it is usually not supported for synthesis, except for use in calculating 
a compile-time constant. The operation A % B returns the remainder of the integer division 
A ~ B. Arithmetic operands of type wire and reg are treated as unsigned numbers. If the 
two operands are of unequal size, zero digits are padded on the left, and bits are truncated 
if the result has fewer digits than the largest operand. Integer variables are considered as 
2’s complement numbers. 

The << and >> operators perform logical shifts to the left and right, respectively. For 
a left shift, zeros are shifted into the LSB, while for a right shift, zeros are shifted into the 
MSB. For synthesis, the operand B should be a constant. 

The {,} concatenate operator allows vectors to be combined to produce a larger resulting 
vector. Any operand that is a constant must have a specified size, as in 4’b0011. Operands 
can be repeated multiple times by using the replication operator. The operation {{3{A}}, 
{2{B}}} is equivalent to {A, A, A, B, B}. The replication operator can be used to form an 
n-bit vector of digits: the operation {n{1’b1}} represents n ones. 

The last item in Table A.1 is the ? : conditional operator. The result of A ? B : C is 
equal to B if operand A evaluates to 1 (True); otherwise, the result is C. In the case that A 
evaluates to x, the conditional operator generates a bitwise output; each bit in the result is 
1 if the corresponding bits in both B and C are 1, 0 if these bits are 0, and x otherwise. 
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The precedence of Verilog operators follows similar rules as in arithmetic and Boolean 
algebra. For example, * has precedence over +, and & has precedence over |. A complete 
listing of the precedence rules is given in Table 4.3. 


| A.8 VERILOG MODULE 


A circuit or subcircuit described with Verilog code is called a module. Figure A.1 gives the 
general structure of a module declaration. The module has a name, module_name, which 
can be any valid identifier, followed by a list of ports. The term port is adopted from the 
electrical jargon, in which it refers to an input or output connection in an electrical circuit. 
The ports can be of type input, output, or inout (bidirectional), and can be either scalar or 
vector. Examples of ports are 


input Cin, x, y; 
input [3:0] X, Y; 
output Cout, s; 
inout [7:0] Bus; 
output [3:0] S; 


wire Cout, s; 
wire [7:0] Bus; 
reg [3:0] S; 


As shown, output and inout ports have an associated type. We assume that Cout, s, and 
Bus are nets in this example, while S is a variable. The wire declarations can actually be 


module module_name [(port_name{, port-name})]; 
[parameter declarations] 
[input declarations] 
[output declarations] 
[inout declarations] 
[wire or tri declarations] 
[reg or integer declarations] 
[function or task declarations] 
[assign continuous assignments] 
[initial block] 
[always blocks] 
[gate instantiations] 
[module instantiations] 
endmodule 


Figure A.1_ The general form of a module. 


A.8 VERILOG MODULE 


omitted, because Verilog assumes that signals are nets by default. However, any port used 
as a variable must be explicitly declared as such. 

Instead of using a separate statement to declare that the variable S is of reg type, we 
can include this declaration in the statement that specifies S to be an output, as follows: 


output reg [3:0] S; 


This is the style we use throughout the book. 

As Figure A.1 indicates, a module may contain any number of net (wire or tri) or 
variable (reg or integer) declarations, and a variety of other types of statements that are 
described later in this appendix. 

Figure A.2 gives the Verilog code for a module fulladd, which represents a full-adder 
circuit. (The full-adder is discussed in Section 3.2.) The input port Cin is the carry-in, 
and the bits to be added are the input ports x and y. The output ports are the sum, s, and 
the carry-out, Cout. The functionality of the full-adder is described with logic equations 
preceded by the keyword assign, which is discussed in Section A.10. 

There is usually more than one way to describe a given circuit using Verilog. Figure 
A.3 gives another version of the fulladd module, in which the functionality is specified by 
using the concatenate and addition operators. The statement 


assign {Cout, s} = x + y + Cin; 


assigns the least-significant bit in the result x + y + Cin to the output s and the most- 
significant bit to Cout. The circuits generated from the modules in Figures A.2 and A.3 are 
the same. 


module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 
output s, Cout; 


assign s=x‘ y * Cin; 
assign Cout = (x & y) | (Cin & x) | (Cin & y); 


endmodule 
Figure A.2 A full-adder module. 
module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 
output s, Cout; 
assign {Cout, s} =x +y + Cin; 
endmodule 


Figure A.3_ A full-adder module defined using the 
+ operator. 
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| A.9 GATE INSTANTIATIONS 


Verilog includes predefined modules that implement basic logic gates. These gates allow a 
circuit’s structure to be described using gate instantiation statements of the form 


gate_name [instance_name] (output_port, input_port{, input_port}); 


The gate_name specifies the desired type of gate, and the instance_name is any unique 
identifier. Each gate may have a different number of ports, with the output port listed first, 
followed by a variable number of input ports. An example of using gates to realize a full- 
adder is given in Figure A.4. The code defines four wire nets, z1 to z4, that connect the gates 
together, and each gate has a specified instance name. Figure A.5 shows a simpler version, 
in which instance names are not included and the declarations of z1 to z4 are omitted. Since 
the nets are not explicitly declared, they are implicitly assumed to be of type wire. 

The logic gates supported in Verilog are summarized in Table A.2. The second col- 
umn describes the function of each gate, and the rightmost column gives an example of 
instantiating the gate. Verilog allows gates with any number of inputs to be specified, but 
some CAD systems set practical limits. The notif and bufif gates represent tri-state buffers 
(drivers). The gate notifO is an inverting tri-state buffer with active-low enable, and notif1 
provides the same functionality with an active-high enable. The bufif0 and bufif1 gates are 
tri-state buffers that do not invert the output. 

For simulation purposes, it is possible to set a parameter of the gate that represents its 
propagation delay. As an example, the following statement instantiates a three-input AND 
gate with a delay of five time units (the units of time are determined by the simulator being 
used): 


and #(5) And3 (z, x1, x2, x3); 


This type of delay parameter has no meaning when using Verilog for synthesis of logic 
circuits. 


// Structural specification of a full-adder 
module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 
output s, Cout; 
wire zl, z2, 23, 24; 


and And! (zl, x, y); 

and And? (z2, x, Cin); 
and And3 (23, y, Cin); 
or Orl (Cout, zl, z2, 23); 
xor Xorl (24, x, y); 

xor Xor2 (s, z4, Cin); 


endmodule 


Figure A.4 A full-adder described using gate instantiation. 
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// Structural specification of a full-adder 
module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 
output s, Cout; 


and (z1, x, y); 

and (z2, x, Cin); 
and (z3, y, Cin); 

or (Cout, zl, z2, z3); 
xor (24, x, y); 

xor (s, 24, Cin); 


endmodule 


Figure A.5 _A simplified version of Figure A.4. 


Table A.2 Verilog gates. 


Name Description Usage 

and f=(@-b---) and (f,a,b,...) 
nand Ce eee) nand (f,a,b,...) 
or f =(a+b4+--:) or (f,a,b,...) 
nor f=(at+tb+--) nor (f,a,b,...) 
xor f=(a@be@.:--) xor (f,a,b,...) 
xnor =(a©OboO-::) xnor (f,a,b,...) 


not =a not (f, a) 


buf a) buf (f, a) 


notif1 = (e 2a: ’bz) notifl (f, a, e) 


bufifO =i (!e 2a: *bz) bufif0 (f, a, e) 


ui 
i 
i 
notif0  f = (le 2a: ’bz) notif0 (f, a, e) 
f 
i 
i 


bufif1 = (e?a: "bz) bufifl (f, a, e) 
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A.10 CONCURRENT STATEMENTS 


In any hardware description language, including Verilog, the concept of a concurrent state- 
ment means that the code may include a number of such statements, and each represents 
a part of the circuit. We use the word concurrent because the statements are considered 
in parallel and the ordering of statements in the code does not matter. Gate instantiations 
are one type of concurrent statements. This section introduces another type of concurrent 
statement, called the continuous assignment. 


A.10.1 CONTINUOUS ASSIGNMENTS 


While gate instantiations allow the description of a circuit’s structure, continuous assign- 
ments permit the description of a circuit’s function. The general form of this statement 
is 

assign net_assignment{, net_assignment}; 


The net_assignment can be any expression involving the operators listed in Table A.1. 
Examples of continuous assignments are 


assign Cout = (x & y) | (x & Cin) | (y & Cin); 
assigns = x“ y “z; 


Although they are not needed in terms of operator precedence, the parentheses in the ex- 
pression for Cout are included for clarity. Multiple assignments can be specified in one 
assign statement, using commas to separate the assignments, as in 


assign Cout = (x & y) | (x & Cin) | (y & Cin), 
s=x*y%Z; 
An example of a multibit assignment is 
wire [1:3] A, B, C; 


assign C = A & B; 


This results in cy = a,b,, cy = agbo, and cz = a3b3. 
The arithmetic assignment 


wire [3:0] X, Y, S; 


assign S = X + Y; 


represents a four-bit adder without carry-in and carry-out. If we declare a carry-in and 
carry-out, 


wire carryin, carryout; 


then the statement 


A.10 CONCURRENT STATEMENTS 


module adder_sign (X, Y, S, S2s); 
input [3:0] X, Y; 
output [7:0] S, S2s; 


assign S=X+/Y, 
S2s = {{4{X[3]}}, X} + ({4{YI3]}}, Y}; 


endmodule 


Figure A.6 — An example of arithmetic assignments and sign extension. 


assign {carryout, S} = X + Y + carryin; 


represents the four-bit adder with carry-in and carry-out. We mentioned in Section A.7 that 
Verilog treats the wire type as an unsigned number. Since a five-bit result is needed in 
{carryout, S}, each operand is padded with a zero. When using Verilog for synthesis, it is 
up to the compiler to determine, or infer, that a four-bit adder with carry-out is needed and 
to recognize the carry-in. 

A complete example of arithmetic assignments is given in Figure A.6. There are two 
four-bit inputs, X and Y, and two eight-bit outputs, S and S2s. To produce the eight-bit 
sum S = X + Y the Verilog compiler automatically pads X and Y with four zeros. The 
assignment to S2s shows how a signed (2’s complement) result can be generated. Recall 
from Section 3.3 that the leftmost bit in a 2’s complement number is the sign bit. The 
assignment to S2s uses the concatenate and replication operators to pad X and Y with four 
copies of their most-significant bit, thereby performing sign extensions. 

As an example, assume that X = 0011 and Y = 1101. The unsigned result is S = 
0011 + 1101 = 00010000, or S = 3 + 13 = 16. ThesignedresultisS2s = 0011 + 1101 = 
00000000, or S2s = 3 + (—3) = 0. 


A.10.2 UsING PARAMETERS 


Figure A.6 specifies an adder for four-bit numbers. We can make this code more general 
by introducing a parameter that sets the number of bits in the adder. Figure A.7 gives the 
code for an n-bit adder module, addern. The number of bits to be added is defined with the 
parameter keyword, introduced in Section A.5. The value of n defines the bit widths of X, 
Y, S,and 82s. 

It is possible to combine a continuous assignment with a wire declaration. For example, 
the sum, s, and carry-out, c, of a half-adder could be defined as 


wire s = x “y, 
c=x&y; 
Verilog allows parameters, such as delays, to be associated with continuous assign- 


ments. These parameters have no meaning for synthesis, but we mention them for com- 
pleteness. Consider the statement 
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module addern (X, Y, S, S2s); 
parameter n = 4; 
input [n—1:0] X, Y; 
output [2*n—1:0] S, S2s; 


assign S=X+/Y, 
S2s = {{n{X[n—1]}}, X} + {{n{Y[n—1]}}, Y}; 


endmodule 


Figure A.7 —_ Using a parameter in an n-bit adder. 


wire #8 s= x" y, 
#5c=x &y; 


These assignments specify that the operation x “ y has an associated propagation delay of 
eight time units, and x & y has a delay of five units. Such delays, which are useful only for 
simulation purposes, can also be associated with wires, as in 


wire #2c; 


assign #5 c =x & y; 


This code specifies that two time units of delay are incurred on the wire c in addition to the 
five time units for the AND gate that produces x & y. 


A.11 PROCEDURAL STATEMENTS 


In addition to the concurrent statements described in the previous section, Verilog also 
provides procedural statements (also called sequential statements). Whereas concurrent 
statements are executed in parallel, procedural statements are evaluated in the order in which 
they appear in the code. Verilog syntax requires that procedural statements be contained 
inside an always block. 


A.11.1 ALWAYS AND INITIAL BLOCKS 


An always block is a construct that contains one or more procedural statements. It has the 
form 


A.11 PROCEDURAL STATEMENTS 


always @(sensitivity_list) 

[begin] 
[procedural assignment statements] 
[if-else statements] 
[case statements] 
[while, repeat, and for loops] 
[task and function calls] 

[end] 


Verilog includes several types of procedural statements. These statements permit the de- 
scription of a circuit in terms of its behavior in a much more powerful way than is possible 
with continuous assignments or gate instantiations. 

When multiple statements are included in an always block, the begin and end keywords 
are needed; otherwise, these keywords can be omitted. The begin and end keywords are 
also used with other Verilog constructs. We refer to the statements delimited by begin and 
end as a begin-end block. 

The sensitivity_list is a list of signals that directly affect the output results generated 
by the always block. A simple example of an always block is 


always @(x, y) 


begin 
s=x’ y; 
c=x&y; 
end 


Since the output variables s and c depend on x and y, these signals are included in the 
sensitivity list, separated by a comma or by the keyword or. We use the comma in this 
book, but it should be noted that in the original version of Verilog it was necessary to use 
the keyword or. When specifying a combinational circuit by using an always block, it is 
possible to write simply 


always @x« 


which indicates that all input signals used in the always block are included in the sensitivity 
list. In the examples in this book we explicitly show the signals in the sensitivity list to 
make it easier to understand the Verilog code. 

The semantics of the always block are as follows: if the value of a signal in the 
sensitivity list changes, then the statements inside the always block are evaluated in the 
order presented. 

For simulation purposes, Verilog also provides the initial construct. The initial and 
always constructs have the same form, but the statements inside the initial construct are 
executed only once, at the start of a simulation. Initial constructs are not meaningful for 
synthesis, hence we do not discuss them further. 

A Verilog module may include several always blocks, each representing a part of the 
circuit being modeled. While the statements inside each always block are evaluated in 
order, there is no meaningful order among the different always blocks. In this sense, each 
entire always block can be considered as a concurrent statement, because a Verilog compiler 
evaluates all always blocks concurrently. 
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Procedural Assignment Statements 


Any signal assigned a value inside an always block has to be a variable of type reg 
or integer. A value is assigned to a variable with a procedural assignment statement. 
There are two kinds of assignments: blocking assignments, denoted by the = symbol, and 
non-blocking assignments, denoted by the <= symbol. The term blocking means that 
the assignment statement completes and updates its left-hand side before the subsequent 
statement is evaluated. This concept is best explained in the context of simulation. Consider 
the blocking assignments 


S=X-+Y; 
p= S[0]; 


At simulation time t; the statements are evaluated, in order. The first statement sets S$ 
using the current values of X and Y, and then the second statement sets p according to this 
new value of S. Verilog also provides non-blocking assignments, specified as 


S<=X-4+Y; 
p <= S[0]; 


In this case, at simulation time ¢; the statements are still evaluated in order, but they both 
use the values of variables that exist at the start of the simulation time t;. The first statement 
determines a new value for S based on the current values of X and Y, but S is not actually 
changed to this value until all statements in the associated always block have been evaluated. 
Therefore, the value of p at time ¢; is based on the value of S at time ¢;_1. We can summarize 
the difference between blocking and non-blocking assignments as follows. For blocking 
assignments, the values of variables seen at time ¢; by each statement are the new values 
set in f; by any preceding statements in the always block. For non-blocking assignments, 
the values of variables seen at time f; are the values set in time f;_1. 

Although we introduced the concepts of blocking and non-blocking assignments in 
the context of simulation, the semantics are the same for synthesis. For combinational 
circuits, only blocking assignments should be used, as we will explain in Section A.11.7. 
We give a number of examples of combinational circuits in the following sections and also 
introduce the if-else, case, and loop statements. Section A.14 focuses on sequential circuits 
and explains that they should be designed with non-blocking assignments. 


A.11.2 THe Ir-ELSE STATEMENT 


The general form of the if-else statement is given in Figure A.8. If expression1 is True, 
then the first statement is evaluated. When multiple statements are involved, they have to 
be included inside a begin-end block. 

The else if and else clauses are optional. Verilog syntax specifies that when else if or 
else are included, they are paired with the most recent unfinished if or else if. 
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if (expression 1) 

begin 
statement; 

end 

else if (expression2) 

begin 
statement; 

end 

else 

begin 
statement; 

end 


Figure A.8 The form of the if-else statement. 


An example of an if-else statement used for combinational logic is 


always @(w0, wl, s) 


if (s == 0) 
f= w0; 

else 
f=wl; 


which defines a 2-to-1 multiplexer with data inputs wo and w, select input s, and output f. 


A.11.3 STATEMENT ORDERING 


Another way of describing the 2-to-1 multiplexer with an if-else statement is presented in 
Figure A.9. Instead of using an else clause, this code first makes the default assignment 
jf = wo and then changes this assignment to f = w; if s has the value 1. The Verilog 
semantics specify that a signal assigned multiple values in an always construct retains the 
last assignment. This example highlights the importance of the ordering of statements in 
an always block. If the statements are reversed, as in 


always @(w0, wl, s) 
begin 
if(s==1) 
f=wl; 
f= w0; 
end 


then the if statement would be evaluated first and the assignment f = wo would be per- 
formed last. Hence, the code would always result in f being set to the value of wo. 


701 


702 


APPENDIX A~ e_ VERILOG REFERENCE 


module mux2tol (w0, wl, s, f); 
input w0, wl, s; 
output reg f; 


always @(w0, wl, s) 


begin 
f = w0; 
if (s == 1) 
f=wl; 
end 
endmodule 


Figure A.9 = Code for a 2-to-1 multiplexer. 


Implied Memory 
Consider the always block 


always @( w0, wl,s ) 


begin 
if (s == 1) 
f=wl; 
end 


This is the same as the code in Figure A.9, except that the default assignment f = w0; has 
been removed. Since the code does not specify a value for the variable f when s is 0, the 
Verilog semantics specify that f must retain its current value. A synthesized circuit has to 
implement the functionality 


f=s wts-f 


Hence, when s = 0, the value of w; is “remembered” by a latch at the output f. This effect 
is called implied memory. We will show shortly that implied memory is the key concept 
used to describe sequential circuits. 


A.11.4 THE CASE STATEMENT 


The form of a case statement is illustrated in Figure A.10. The bits in expression, called the 
controlling expression, are checked for a match with each alternative. The first successful 
match causes the associated statements to be evaluated. Each digit in each alternative 
is compared for an exact match of the four values 0, 1, x, and z. A special case is the 
default clause, which takes effect if no other alternative matches. When using Verilog for 
simulation, an alternative can be a general expression, but for synthesis these items are 
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case (expression) 
alternativel: begin 
statement; 
end 
alternative2: begin 
statement; 
end 
[default: begin 
statement; 
end] 
endcase 


Figure A.10 The general form of the ease statement. 


restricted to a single constant, such as 1’b0:, or a list of constants separated by commas, 
such as 1, 2, 3:. 
An example of a case statement is 


always @(w0, wl, s) 


case (s) 
1’b0: f= wO; 
Vbl: f=wl; 
endcase 


This code represents the same 2-to-1 multiplexer described in Section A.11.2 using the 
if-else statement. When using Verilog for simulation, it is necessary to give alternatives for 
all possible valuations of the controlling expression. A default has to be included for any 
valuations not explicitly covered by the listed alternatives. In this example, s can have the 
four values 0, 1, x, or z; hence, we could include a default to handle the cases s = x and 
s = z. We have not included the default clause here because we are concerned only with 
synthesis, and the synthesis tools require only the bit values 0 and | to be considered. 

Figure A.11 demonstrates the use of a case statement to specify truth tables. This code 
represents the same full-adder that is described in Figures A.2 and A.3. The controlling 
expression in the case statement is the concatenated bits {Cin, x, y}, and the alternatives 
correspond to the rows of the truth table in Figure 3.3a. 

The case statement is also important for representing some types of sequential circuits, 
such as finite state machines, which are discussed in Section A.14. 


A.11.5 CASEZ AND CASEX STATEMENTS 


In the case statement, the values x or z in an alternative are checked for an exact match with 
the same values in the controlling expression. The casez statement adds more flexibility, 
by treating a z digit in an alternative as a don’t-care condition. The casex statement treats 
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// Full adder 

module fulladd (Cin, x, y, s, Cout); 
input Cin, x, y; 
output reg s, Cout; 


always @(Cin, x, y) 
begin 
case ( {Cin, x, y} ) 
3’b000: {Cout, s} = ’b00; 
3’b001: {Cout, s} = ’b01; 
3’b010: {Cout, s} = ’b01; 
3’bO11: {Cout, s} = ’b10; 
3’b100: {Cout, s} = ’b01; 
3’b101: {Cout, s} = ’b10; 
3’b110: {Cout, s} = ’b10; 
3’b111: {Cout, s} =’b11; 
endcase 
end 


endmodule 


Figure A.11 — Using a ease statement to specify a truth table. 


both x and z as don’t cares. The alternatives do not have to be mutually exclusive. If they 
are not, then the first matching item has priority. Figure A.12 shows how casex can be used 
to describe a priority encoder with the 4-bit input W and the outputs Y and f. The priority 
encoder is defined in Figure 4.20 (the output z in Figure 4.20 is named f in the Verilog code 
in Figure A.12). The first alternative, 1xxx, specifies that if w3 has the value 1, then the 
other inputs are treated as don’t cares, hence the output is set to Y = 3. Similarly, the other 
alternatives describe the desired priority scheme. 


A.11.6 Loop STATEMENTS 


Verilog includes four types of loop statements: for, while, repeat, and forever. Synthesis 
tools typically support the for loop, which has the general form 


for (initial_index; terminal_index; increment) 
begin 

statement; 
end 


This syntax is very similar to the for loop in the C programming language. The initial_index 
is evaluated once, before the first loop iteration, and typically performs the initialization 
of the integer loop control variable, such as k = 0. In each loop iteration, the begin-end 
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module priority (W, Y, f); 
input [3:0] W; 
output reg [1:0] Y; 
output f; 


assign f =(W !=0); 
always @(W) 
begin 
casex (W) 
*bI xxx: 
*bO1xx: 
*bOO1x: 
default: 
endcase 
end 


endmodule 


Figure A.12 A priority encoder described using a 
casex statement. 


block is performed, and then the increment statement is evaluated. A typical increment 
statement is k = k + 1. Finally, the terminal_index condition is checked, and if it is True 
(1), then another loop iteration is done. For synthesis, the terminal_index condition has to 
compare the loop index to a constant value, such as k < 8. 

An example of using a for loop to describe an n-bit ripple-carry adder is presented in 
Figure A.13. The effect of the loop is to repeat its begin-end block for the specified values 
of k. In this example, each loop iteration, k, defines a full-adder with the inputs x;, y,, and 
cx, and the outputs s; and c+. It is possible to define the integer k (parameters can also 
be defined in this way) inside the always block if the begin-end block has a label. For 
example, 


always @(X, Y, carryin) 
begin: fulladders 
integer k; 
C[0] = carryin; 
for (k = 0; k <=n—1;k =k+1) 
begin 
S[k] = X[k] “ Y[k] * C[k]; 
C[k+1] = (X[k] & Y[k]) | (C[k] & X[k]) | (C[k] & Y[k]); 
end 
carryout = C[n]; 
end 
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module ripple (carryin, X, Y, S, carryout); 
parameter n = 4; 
input carryin; 
input [n—1:0] X, Y; 
output reg [n—1:0] S; 
output reg carryout; 
reg [n:0] C; 
integer k; 


always @(X, Y, carryin) 
begin 
C[0] = carryin; 
for (k = 0; k <= n-1;k =k+1) 
begin 
Stk] = X[k] “ Y[k] “ C[k]; 
C[k+1] = (X[k] & Y[k]) | (C[k] & X[k]) | (C[k] & Y[k]); 
end 
carryout = C[n]; 
end 


endmodule 


Figure A.13. 0 An n-bit ripple-carry adder using a for loop. 


A second for-loop example is given in Figure A.14. This code produces a count of the 
number of bits in the n-bit input X that have the value 1. Unrolling the loop, the first two 
iterations give 


Count = Count + X[0]; 
Count = Count + X[1]; 


The first statement produces the value Count = 0 + X [0] = X [0]. The second assignment 
then gives Count = X [0] + X[1], and so on for the other loop iterations. At the end of the 
loop, we have 


Count = X[0] + X[1] +---+ X[n—1] 


Asynthesis tool generates a circuit that has a cascade of adders to implement this expression. 
For example, for n = 3 a possible circuit that employs two-bit adders is shown in Figure 
A.15. 

Figure A.16 shows the general forms of the while and repeat loops. The while loop 
has the same structure as the corresponding statement in the C language, and the repeat 
loop simply specifies a number of times to repeat its begin-end block. The forever loop, 
not shown in the figure, loops endlessly. 
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module bit_count (X, Count); 
parameter n = 4; 
parameter logn = 2; 
input [n—1:0] X; 
output reg [logn:0] Count; 
integer k; 
always @(X) 
begin 
Count = 0; 
for (k = 0;k <n; k=k+1) 
Count = Count + X[k]; 
end 


endmodule 


Figure A.14 A bit-counting example. 


0 X{1] O X[0] 


x) % Yr Yo 


2-bit adder 


x, Xo ¥1 Yo 


2-bit adder 


Count [1] Count [0] 


Figure A.15 A circuit that implements the code in 
Figure A.14. 


707 


708 APPENDIX A~ e_ VERILOG REFERENCE 


while (condition) 
begin 

statement; 
end 


repeat (constant_value) 
begin 

statement; 
end 


Figure A.16 The general forms of the while and 
repeat statements. 


A.11.7 BLOCKING VERSUS NON-BLOCKING ASSIGNMENTS FOR 
COMBINATIONAL CIRCUITS 


All our previous examples of combinational circuits used blocking assignments, which is 
a good way to design such circuits. A natural question is whether combinational circuits 
can be described using non-blocking assignments. The answer is that this would work in 
many cases, but if subsequent assignments depend on the results of preceding assignments, 
non-blocking assignments can produce nonsensical combinational circuits. As an example, 
consider changing the for loop in Figure A. 14 to use non-blocking assignments, as indicated 
in Figure A.17. For simplicity assume that n = 3, so that the unrolled loop is 


Count <= Count + X[0]; 
Count <= Count + X[1]; 
Count <= Count + X[2]; 


Since non-blocking assignments are involved, each subsequent assignment statement sees 
the starting value of Count = 0 rather than a new Count value produced by the previous 
statements. The for loop thus degenerates to 


Count <= 0 + X[0]; 
Count <= 0+ X[1]; 
Count <= 0 + X[2]; 


When there are multiple assignments to the same variable in an always block, Verilog 
semantics specify that the variable retains its last assignment. Therefore, the code produces 
the wrong result Count = X [2]. 


A.12. USING SUBCIRCUITS 


always @(X) 
begin 
Count = 0; 
for (k =0;k <n; k=k+1) 
Count <= Count + X[k]; 
end 


Figure A.17 —_ Using non-blocking assignments for a 
combinational circuit. 
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A Verilog module can be included as a subcircuit in another module. For this to work, 
both modules must be defined in the same file or else the Verilog compiler must be told 
where each module is located (the mechanism for doing this varies from one compiler to the 
next). The general form of a module instantiation statement is similar to a gate instantiation 
statement 


module_name [#(parameter overrides)] instance_name ( 
._port_name ( [expression] ) {, .port_name ( [expression] )} ); 


The instance_name can be any legal Verilog identifier and the port connections specify how 
the module is connected to the rest of the circuit. The same module can be instantiated 
multiple times in a given design provided that each instance name is unique. The #(param- 
eter overrides) can be used to set the values of parameters defined inside the module_name 
module. We discuss this feature in the next section. Each port_name is the name of a 
port in the subcircuit, and each expression specifies a connection to that port. The syntax 
._port_name is provided so that the order of signals listed in the instantiation statement does 
not have to be the same as the order of the ports given in the module statement of the 
subcircuit. In Verilog jargon, this is called named port connections. If the port connections 
are given in the same order as in the subcircuit, then .port_name is not needed. This format 
is called ordered port connections. 

An example is presented in Figure A.18. It gives the code for a four-bit ripple-carry 
adder built using four instances of the fulladd subcircuit in Figure A.2. The inputs to the 
adder are carryin and two four-bit numbers X and Y. The output is the four-bit sum, S, and 
carryout. A three-bit signal, C, represents the carries from stages 0, 1, and 2. This signal is 
declared as a wire vector with three bits. 

The adder4 module instantiates four copies of the full-adder subcircuit. In the first three 
instantiation statements, we use ordered port connections because the signals are listed in 
the same order as given in the declaration of the fulladd module in Figure A.2. The last 
instantiation statement gives an example of named port connections. The port connections 
used in the instantiation statements specify how the fulladd instances are interconnected by 
nets to create the adder module. 

Figure A.19 gives an example of a hierarchical Verilog file containing two modules. The 
bottom module, seg7, specifies the BCD-to-7-segment code converter shown in Figure 2.63. 


710 


APPENDIX A~ e_ VERILOG REFERENCE 


module adder4 (carryin, X, Y, S, carryout); 
input carryin; 
input [3:0] X, Y; 
output [3:0] S; 
output carryout; 
wire [3:1] C; 


fulladd stageO (carryin, X[0], Y[0], S[O], C[1]); 

fulladd stagel (C[1], X[1], Y(1], S[1], C[2]); 

fulladd stage2 (C[2], X[2], Y[2], S[2], C[3]); 

fulladd stage3 (.Cout(carryout), .s(S[3]), -yCY[3]), -x(X[3]), -Cin(C[3])); 


endmodule 


Figure A.18 — An example of module instantiation. 


It has the four-bit bcd input, which represents a binary-coded-decimal digit, and the seven- 
bit /eds output, which is intended to drive the seven segments a to g on a digit-oriented 
display. Three copies of the seven-segment decoder are instantiated in the top module, 
group. It has a 12-bit input, Digits, and a 21-bit output, Lights, that are connected to the 
three instantiated subcircuits. 


A.12.1 SUBCIRCUIT PARAMETERS 


When a subcircuit includes parameters, their default values can be changed in an instan- 
tiation statement. Figure A.20 gives a module with two eight-bit inputs, X and Y, and a 
four-bit output, C. The module determines the number of bits in X and Y that are identical. 
The code first uses the bitwise XNOR operation to generate a signal T that has Is in the bit 
positions where X and Y have identical values. Then it instantiates the subcircuit specified 
in Figure A.14 to count the number of Is in 7. The syntax #(8,3) overrides the default 
values of the subcircuit parameters in the order given in the code. Since Figure A.14 first 
defines the parameter n, followed by /ogn, then #(8,3) sets n = 8 and /ogn = 3. If only one 
parameter is specified, such as #(8), it overrides the first parameter given in the subcircuit. 
The parameters can also be referred to by name, as in #(.n(8), .logn(3)). 

An alternative syntax for overriding parameter values is shown in Figure A.21. Here, 
the subcircuit parameters are specified by the separate statement 


defparam cbits.n = 8, cbits.logn = 3; 


This statement is not a part of the instantiation statement; hence, it can appear anywhere 
in the code. The intended subcircuit is identified uniquely by its instance name, cbits. 
If the defparam statement is in a separate module from the corresponding instantiation 
statement, the subcircuit module name should be specified in addition to the instance name. 
An example is 


defparam bit_count.cbits.n = 8, bit_count.cbits.logn = 3; 


A.12 USING SUBCIRCUITS 711 


module group (Digits, Lights); 
input [11:0] Digits; 
output [1:21] Lights; 


seg7 digit0 (Digits[3:0], Lights[1:7]); 
seg7 digit! (Digits[7:4], Lights[8:14]); 
seg7 digit2 (Digits[11:8], Lights[15:21]); 


endmodule 


module seg7(bcd, leds); 
input [3:0] bcd; 
output reg [1:7] leds; 


always @(bcd) 

case (bcd) //abcdefg 

0: leds = 7’b1111110; 

: leds = 7’b0110000; 
: leds = 77b1101101; 
: leds = 77b1111001; 
: leds = 7’b0110011; 
: leds = 7’b1011011; 
: leds = 7’b1011111; 
: leds = 7’b1110000; 
: leds =7’b1111111; 

9: leds = 7°b1111011; 

default: leds = 7’ bx; 
endcase 


AADNFWN Ke 


endmodule 


Figure A.19 ~— An example of a hierarchical design file. 


module common (X, Y, C); 
input [7:0] X, Y; 
output [3:0] C; 
wire [7:0] T; 


// Make Ti] = 1 if XG] == Y{iJ 
assign T =X ~’ Y; 


bit_count #(8,3) cbits (T, C); 
endmodule 


Figure A.20 = Overriding module parameters using #. 
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module common (X, Y, C); 
input [7:0] X, Y; 
output [3:0] C; 
wire [7:0] T; 


// Make T[i] = 1 if X[i] == YIil 
assign T =X ~“ Y; 


bit_count cbits (T, C); 
defparam cbits.n = 8, cbits.logn = 3; 


endmodule 


Figure A.21 = Overriding module parameters using the 
defparam statement. 


A.12.2 THE GENERATE CAPABILITY 


Figure A.18 instantiates four copies of the fulladd subcircuit to form a four-bit ripple-carry 
adder. A natural extension of this code is to add a parameter that sets the number of bits 
needed and then use a loop to instantiate the required subcircuits. This can be achieved 
with the generate construct. 

The generate construct has the simplified form 


generate 
[for loops] 
[if-else statements] 
[case statements] 
[instantiation statements] 
endgenerate 


This construct enhances the flexibility of Verilog modules, because it allows instantiation 
statements to be included inside for loops and if-else statements. If a for loop is included 
in the generate block, the loop index variable has to be declared of type genvar. A genvar 
variable is similar to an integer variable, but it can have only positive values and it can be 
used only inside generate blocks. 

Figure A.22 shows the ripple_g module, which instantiates n fulladd modules. Each 
instance generated in the for loop will have a unique instance name produced by the compiler 
based on the for loop label. The generated names are addbit[0].stage, ..., addbit{n — 
1].stage. This code produces the same result as the code in Figure A.13. 

The generate construct can include concurrent statements and procedural statements, 
but its main advantage is in the instantiation of gates and modules inside for loops and 
if-else statements. 


A.13. FUNCTIONS AND TASKS 


module ripple_g (carryin, X, Y, S, carryout); 
parameter n = 4; 
input carryin; 
input [n—1:0] X, Y; 
output [n—1:0] S; 
output carryout; 
wire [n:0] C; 


genvar i; 
assign C[0] = carryin; 
assign carryout = C[n]; 


generate 
for (i = 0; i1<=n-1;i=i+1) 
begin:addbit 
fulladd stage (C[i], X[i], Y(i], Sf], Clit+1]); 
end 
endgenerate 


endmodule 


Figure A.22 = An n-bit ripple-carry adder using the generate 
statement. 
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Figure 4.4 shows how a 16-to-1 multiplexer module can be created by instantiating five 
4-to-1 multiplexers. Another way of modeling this circuit is to use a Verilog function which 
has the general form 


function [range | integer] function_name; 
[input declarations] 
[parameter, reg, integer declarations] 
begin 
statement; 
end 
endfunction 


The purpose of a function is to allow the code to be written in a modular fashion 
without defining separate modules. A function is defined within a module, and it is called 
either in a continuous assignment statement or in a procedural assignment statement inside 
that module. A function can have more than one input, but it does not have an output, 
because the function name itself serves as the output variable. Figure A.23 shows how the 
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module mux_f (W, S16, f); 
input [0:15] W; 
input [3:0] S16; 
output reg f; 
reg [0:3] M; 


function mux4tol; 
input [0:3] W; 
input [1:0] S; 


if (S ==0) mux4tol = W[0]; 

else if (S == 1) mux4tol = W[1]; 

else if (S == 2) mux4tol = W[2]; 

else if (S == 3) mux4tol = W[3]; 
endfunction 


always @(W, S16) 

begin 
M[0] = mux4tol(W[0:3], S16[1:0]); 
M[1] = mux4tol(W[4:7], S16[1:0]); 
M[2] = mux4tol(W[8:11], S16[1:0]); 
M[3] = mux4tol(W[12:15], S16[1:0]); 
f = mux4tol(M[0:3], S16[3:2]); 

end 


endmodule 


Figure A.23) An example of a function. 


16-to-1 multiplexer module can be written using a function. To see how this code works, 
consider the assignment 


f = mux4tol(M[0:3], S16[3:2]); 


The effect of the function call is to have the Verilog compiler replace the assignment 
statement with the function body. The in-line equivalent of the above function call is 


if (S16[3:2] == 0) f= Mf[O]; 

else if (S16[3:2] == 1) f= MII]; 
else if (S16[3:2] == 2) f= MI[2]; 
else if (S16[3:2] == 3) f= MI[3]; 


Similarly, each function call is inserted in-line, with the substitution of appropriate bits 
from S/6, W, and M. 


A.13. FUNCTIONS AND TASKS 


A second example of a function is shown in Figure A.24. The group_ f module is 
equivalent to the group module in Figure A.19. Where the group module instantiates three 
copies of the seg7 subcircuit, group_ f achieves the same effect by using a function. Since 
it returns a seven-bit value, the function is defined with the syntax 


function [1:7] leds; 


Consider again the 16-to-1 multiplexer example in Figure A.23. Another method of 
writing this code appears in Figure A.25. This code uses a Verilog task, which is similar 
to a function. While a function returns a value, a task does not; it has input and output 
variables, like a module. A task can be called only from inside an always (or initial) block. 


module group_f (Digits, Lights); 
input [11:0] Digits; 
output reg [1:21] Lights; 


function [1:7] leds; 
input [3:0] bed; 
begin 
case (bcd) // abcdefg 
0: leds = 77b1111110; 


1: leds = 7’b0110000; 
2: leds = 7761101101; 
3: leds = 7’b1111001; 
4: leds = 7’b0110011; 
5: leds = 77b1011011; 
6: leds = 7’b1011111; 
7: leds = 7’b1110000; 
8: leds = 7’b1111111; 
9: leds = 7’b1111011; 
default: leds = 7’ bx; 
endcase 
end 
endfunction 


always @ (Digits) 

begin 
Lights[1:7] = leds(Digits[3:0]); 
Lights[8:14] = leds(Digits[7:4]); 
Lights[15:21] = leds(Digits[11:8]); 

end 


endmodule 


Figure A.24 — Using a function to implement the group module. 
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module mux_t (W, S16, f); 
input [0:15] W; 
input [3:0] S16; 
output reg f; 
reg [0:3] M; 


task mux4tol; 
input [0:3] W; 
input [1:0] S; 
output Result; 
begin 
if (S ==0) Result = W[0]; 
else if (S == 1) Result = W[1]; 
else if (S == 2) Result = W[2]; 
else if (S == 3) Result = W[3]; 
end 
endtask 


always @(W, S16) 

begin 
mux4tol(W[0:3], S16[1:0], M[0]); 
mux4tol(W[4:7], S16[1:0], M[1]); 
mux4tol(W[8:11], S$16[1:0], M[2]); 
mux4tol(W[12:15], S16[1:0], M[3]); 
mux4tol(M[0:3], $16[3:2], f); 

end 


endmodule 


Figure A.25 = An example of a task. 


In a similar way as described above for the Verilog function, the compiler essentially inserts 
the code for the task body at the point in the code where it is called. 

Functions and tasks are not crucial for designing Verilog code, but they facilitate the 
writing of modular code without using separate modules. One advantage of functions 
and tasks is that they can be called from an always block, whereas these blocks are not 
allowed to contain instantiation statements. These features of Verilog become increasingly 
important as the size of the code being developed increases. 


A.14 SEQUENTIAL CIRCUITS 


While combinational circuits can be modeled with either continuous assignment or pro- 
cedural assignment statements, sequential circuits can be described only with procedural 
statements. We now give some examples of sequential circuits. 


A.14 SEQUENTIAL CIRCUITS 


A.14.1 A GATED D LATCH 


Figure A.26 gives the code for a gated D latch. The sensitivity list for the always block 
includes both the data input, D, and clock, c/k. The if statement specifies that Q should 
be set to the value of D whenever clk is 1. There is no else clause in the if statement. As 
we explained in Section A.11.3, this situation implies that Q should retain its present value 
when the if condition is not met. 


A.14.2 D Fuir-FLop 


Figure A.27 shows how flip-flops are described in Verilog. The always construct uses the 
special sensitivity list @(posedge Clock). This event expression tells the Verilog compiler 
that any reg variable assigned a value in the always construct is the output of a D flip-flop. 
The code in the figure generates a flip-flop with the input D and the output Q that is sensitive 
to the positive clock edge. A negative-edge sensitive flip-flop is specified by @(negedge 
Clock). 

We said in Section A.11.1 that sequential circuits should be described with non-blocking 
assignments, and we used this type of assignment in Figure A.27. The behavior of blocking 
and non-blocking assignments for sequential circuits is discussed in Section A.14.5. 


module latch (D, clk, Q); 
input D, clk; 
output reg Q; 


always @(D, clk) 
if (clk) 
Q=D; 


endmodule 


Figure A.26 ~— A gated D latch. 


module flipflop (D, Clock, Q); 
input D, Clock; 
output reg Q; 


always @(posedge Clock) 
Q<=D; 


endmodule 


Figure A.27 AD flip-flop. 
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A.14.3  FLip-FLOPS WITH RESET 


Figure A.28 gives an always block that is similar to the one in Figure A.27. It describes a D 
flip-flop with an asynchronous reset (clear) input, Resetn. When Resetn = 0, the flip-flop 
output Q is set to 0. Appending the letter n to a signal name is a popular convention to 
denote an active-low signal. 

Verilog syntax requires that a sensitivity list contains either all edge-sensitive events 
or all level-sensitive events but not a mixture; hence, the reset condition is checked using 
negedge Resetn. An active-high reset would require the event posedge Reset and the if-else 
statement would then check for the condition (Reset == 1). 

In general, Verilog offers a variety of ways to describe a given circuit. But, for speci- 
fying flip-flops, the format of the code is quite strict. Only minor variations of the code in 
Figures A.27 and A.28 can be made and still infer the desired flip-flops. For instance, the 
if-else statement could alternatively specify if (!Resetn), but this has to be the first state- 
ment in the always block. Note that nothing is special about the variable name Clock; the 
keyword posedge and the format of the rest of the always block are what allow the Verilog 
compiler to recognize the flip-flop clock signal. 

Figure A.29 shows how a flip-flop with a synchronous reset input can be described. 
Since only the posedge Clock event appears in the sensitivity list of the always block, the 
reset operation has to be synchronized to the clock edge. 


A.14.4 REGISTERS 


One possible approach for describing a multibit register is to create an entity that instantiates 
multiple flip-flops. A more convenient method is illustrated in Figure A.30. It gives the 
same code shown in Figure A.28 but using the four-bit input D and the four-bit output Q. 
The code describes a four-bit register with asynchronous clear. 

Figure A.31 shows how the code in Figure A.30 can be extended to represent an n-bit 
register with an enable input, E. The number of flip-flops is set by the parameter n. When 
the active clock edge occurs, the flip-flops in the register cannot change their stored values 
if the enable F is 0. If E = 1, the register responds to the active clock edge in the normal 
way. 


module flipflop_ar (D, Clock, Resetn, Q); 
input D, Clock, Resetn; 
output reg Q; 


always @(posedge Clock, negedge Resetn) 
if (Resetn == 0) 
Q<=0; 
else 
Q<=D; 


endmodule 


Figure A.28 AD flip-flop with asynchronous reset. 
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module flipflop_sr (D, Clock, Resetn, Q); 
input D, Clock, Resetn; 
output reg Q; 


always @(posedge Clock) 
if (Resetn == 0) 
Q<=0; 
else 
Q<=D; 


endmodule 


Figure A.29 = A.D flip-flop with synchronous reset. 


module reg4 (D, Clock, Resetn, Q); 
input [3:0] D; 
input Clock, Resetn; 
output reg [3:0] Q; 


always @(posedge Clock, negedge Resetn) 
if (Resetn == 0) 

Q <= 4’b0000; 
else 

Q<=D; 


endmodule 


Figure A.30 A four-bit register with asynchronous clear. 


module regne (D, Clock, Resetn, E, Q); 
parameter n = 4; 
input [n—1:0] D; 
input Clock, Resetn, E; 
output reg [n—1:0] Q; 


always @(posedge Clock, negedge Resetn) 
if (Resetn == 0) 
Q<=0; 
else if (E) 
Q<=D; 


endmodule 


Figure A.31) = Ann-bit register with asynchronous clear and enable. 
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A.14.5 SHIFT REGISTERS 


An example of code that defines a three-bit shift register is provided in Figure A.32. The 
lines of code are numbered for ease of reference. The shift register has a serial input, w, and 
parallel outputs, Q. The right-most bit in the register is Q[3], and the left-most bit is Q[1]. 
Shifting is performed in the right-to-left direction. All assignments to Q are synchronized 
to the clock edge by the (posedge Clock) event, hence Q represents the outputs of flip-flops. 
The statement in line 6 specifies that Q[3] is assigned the value of w. The semantics of the 
non-blocking assignments mean that the subsequent statements do not see the new value 
of Q[3] until the next time the always block is evaluated (in the following clock cycle). In 
line 7, the current value of Q[3], before it is shifted as a result of line 6, is assigned to Q[2]. 
Line 8 completes the shift operation by assigning the current value of Q[2] to Q[1], before 
it is changed as a result of line 7. 

The key point that has to be appreciated in the code in Figure A.32 is that the assignments 
in lines 6 to 8 do not take effect until the end of the always block. Hence, all flip-flops 
change their values at the same time, as required in the shift register. We could write the 
statements in lines 6 to 8 in any order without changing the meaning of the code. 


Blocking Assignments for Sequential Circuits 

We said previously that blocking assignments should not be used for sequential circuits. 
As an example of the semantics involved, Figure A.33 gives the always block from Figure 
A.32 with blocking assignments. The first assignment sets Q[3] = w. Since blocking 
assignments are involved, the next statement sees this new value of Q[3]; hence, it produces 
Q[2] = Q[3] = w. Similarly, the final assignment gives Q[1] = Q[2] = w. The code does 
not describe the desired shift register, but rather loads all flip-flops with the value on the 
input w. 

For the code in Figure A.33 to correctly describe a shift register, the ordering of the 
three assignments has to be reversed. Then the first assignment sets Q[1] to the value of 
Q[2], the second sets Q[2] to the value of Q[3], and so on. Each successive assignment is 
not affected by the one that precedes it; hence, the semantics of blocking assignments cause 
no problem. 


1 module shift3 (w, Clock, Q); 
2 input w, Clock; 
3 output reg [1:3] Q; 


always @(posedge Clock) 
begin 

Q[3] <= w; 

QI2] <= Q{3]; 

QU1] <= Q(2]; 


end 


OANDHDMNF 


10 endmodule 


Figure A.32 A three-bit shift register. 
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module shift3 (w, Clock, Q); 
input w, Clock; 
output reg [1:3] Q; 


always @(posedge Clock) 


begin 
QI3] = w; 
Ql2] = QI3); 
QU] = Q(2); 
end 
endmodule 


Figure A.33 = Wrong code for a three-bit shift register. 


module count4 (Clock, Resetn, E, Q); 
input Clock, Resetn, E; 
output reg [3:0] Q; 


always @(posedge Clock, negedge Resetn) 
if (Resetn == 0) 
Q<=0; 
else if (E) 
Q<=Q+1; 


endmodule 


Figure A.34 — Code for a four-bit counter. 


To avoid the confusing dependence on the ordering of statements, blocking assignments 
should be avoided when modeling sequential circuits. Also, because they imply differing 
semantics, blocking and non-blocking assignments should never be mixed in a single always 
construct. 


A.14.6 COUNTERS 


Figure A.34 presents the code for a four-bit counter with an asynchronous reset input. The 
counter also has an enable input, E. On the positive clock edge, if EF is 1, the count is 
incremented. If E = O, the counter holds its current value. 
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Resetn Y Clock xX 


— 


L  Down-Counter Register 


Result 


Figure A.35 ~The accumulator circuit. 


A.14.7 AN EXAMPLE OF A SEQUENTIAL CIRCUIT 


An example of a sequential circuit is given in Figure A.35. It adds together the values of the 
k-bit input X over successive clock cycles, and stores the sum of these values into a k-bit 
register. Such a circuit is often called an accumulator. To store the result of each addition 
operation, the circuit includes a k-bit register with an asynchronous reset input, Resetn. It 
also has an enable input, E, which is controlled by a down-counter. The down-counter has 
an asynchronous load input and a count enable input. The circuit is operated by first setting 
Resetn to 0, which resets the contents of the k-bit register to 0 and loads the down-counter 
with an m-bit number on the Y input. 

Then, in each clock cycle, the counter is decremented, and the sum outputs from the 
adder are loaded into the register. When the counter reaches 0, the enable inputs on both 
the register and counter are set to 0 by the OR gate. The circuit remains in this state until 
it is reset again. The final value stored in the register is the sum of the values of X in each 
of the Y clock cycles. 

We designed the accumulator circuit by using two subcircuits described in this ap- 
pendix: ripple (Figure A.13) and regne (Figure A.31). The complete code is given in 
Figure A.36. It uses the parameter k to set the number of bits in the input X , and parameter 
m to specify the number of bits in the counter. Using these parameters in the code makes 
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module accum (X, Y, Clock, Resetn, Result); 
parameter k = 8; 
parameter m = 4; 
input [k—1:0] X; 
input [m-—1:0] Y; 
input Clock, Resetn; 
output [k—1:0] Result; 
wire [k—1:0] Sum; 
wire Cout, Go; 
reg [m—1:0] C; 


ripple ul (.carryin(0), .X(X), .Y(Result), .S(Sum), .carryout(Cout)); 
defparam ul.n=k; 

regne u2 (.D(Sum), .Clock(Clock), .Resetn(Resetn), .E(Go), .Q(Result)); 
defparam u?2.n = 8; 


always @(posedge Clock, negedge Resetn) 
if (Resetn == 0) 
C<=Y; 
else if (Go) 
C<=C-1; 


assign Go = | C; 
endmodule 


Figure A.36 = Code for a k-bit accumulator circuit. 


it easy to change the bit width at a later time if desired. The code defines the signal Sum 
to represent the outputs of the adder; for simplicity, we ignore the possibility of arithmetic 
overflow and assume that the sum will fit into k bits. The m-bit signal C represents the 
outputs from the down-counter. The Go signal is connected to the enable inputs on the 
register and counter. 


A.14.8 Moore-TyPe FINITE STATE MACHINES 


Figure A.37 gives the state diagram of a simple Moore machine. Verilog code for this 
machine is shown in Figure A.38. The two-bit vector y represents the present state of the 
machine, and the state codes are defined as parameters. Some CAD synthesis systems 
provide a means of requesting that the state assignment be chosen automatically, but we 
have specified the assignment manually in this example. 

The present state signal y corresponds to the outputs of the state flip-flops, and the 
signal Y represents the inputs to the flip-flops, which define the next state. The code has two 
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Reset 


w=l 


Figure A.37 — State diagram of a simple Moore-type FSM. 


always blocks. The top one describes a combinational circuit and uses a case statement to 
specify the values that Y should have for each value of y. The other always block represents 
a sequential circuit, which specifies that y is assigned the value of Y on the positive clock 
edge. The always block also specifies that y should take the value A when Resetn is 0, 
which provides the asynchronous reset. 

Since the machine is of the Moore type, the output z can be defined using the assignment 
statement z = (y == C) that depends only on the present state of the machine. This statement 
is provided as a continuous assignment at the end of the code, but it could alternatively have 
been given inside the top always block that represents the combinational part of the FSM. 
This assignment statement cannot be placed inside the bottom always block. Doing so 
would cause z to be the output of a separate flip-flip, rather than a combinational function 
of y. This circuit would set z to 1 one clock cycle later than required when the machine 
enters state C. 

An alternative version of the code for the Moore machine is given in Figure A.39. This 
code uses a single always block to define both the combinational and sequential parts of 
the finite state machine. In practice, the code in Figure A.38 is used more commonly. 


A.14.9 MEALY-TYPE FINITE STATE MACHINES 


A state diagram for a simple Mealy machine is shown in Figure A.40, and the corresponding 
code is given in Figure A.41. The code has the same structure as in Figure A.38 except 
that the output z is defined within the top always block. The case statement specifies that, 
when the FSM is in state A, z should be 0, but when in state B, z should take the value of 
w. Since the top always block represents a combinational circuit, the output z can change 
value as soon as the input w changes, as required for the Mealy machine. 
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module moore (Clock, w, Resetn, z); 
input Clock, w, Resetn; 
output z; 
reg [1:0] y, Y; 
parameter A = 2’b00, B = 2’b01, C = 2’b10; 


always @(w, y) 


begin 
case (y) 
A: if(w==0) Y=A; 
else Y=B; 
B: if(w==0) Y=A; 
else Y=C: 
C: if(w==0) Y=A; 
else Y=C: 
default: Y = 2’bxx; 
endcase 
end 


always @(posedge Clock, negedge Resetn) 
begin 
if (Resetn == 0) 
y <=A; 
else 
y <=Y; 
end 


assign z = (y == C); 
endmodule 


Figure A.38 — Code for a Moore machine. 


A.15 GUIDELINES FOR WRITING VERILOG CODE 


Modern digital systems are large and complex. A good approach in designing such systems 
is to decompose them into smaller, manageable parts. Each part can then be designed by 
making use of the popular subcircuit building blocks that we describe in this book. 

When using Verilog for synthesis, it is essential to write code such that the compiler 
will generate the intended circuit. For example, code for combinational circuits like adders, 
multiplexers, encoders, and decoders should be written as illustrated in Chapters 3 and 4. 
Flip-flops, registers, and counters should be described with the style of code presented 
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module moore (Clock, w, Resetn, z); 
input Clock, w, Resetn; 
output Zz; 
reg [1:0] y; 
parameter A = 2’b00, B = 2’b01, C = 2’b10; 


always @(posedge Clock, negedge Resetn) 


begin 
if (Resetn == 0) 
y <=A; 
else 
case (y) 
A: if(w==0) y<=A; 
else y <=B; 
B: if(w==0) y<=A; 
else y <=C; 
Cc: if(w==0) y<=A; 
else y<=C; 
default: y <= 2’bxx; 
endcase 
end 


assign z = (y == C); 
endmodule 


Figure A.39 = Alternative version of the code for a Moore machine. 
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Figure A.40 State diagram of a Mealy-type FSM. 


in Chapter 5; and finite-state machine code should be expressed in the manner shown in 
Chapter 6. Chapter 7 gives some examples of how larger circuits can be constructed by 
employing these common building blocks. This approach of writing HDL code such that the 
system is built by interconnecting common, relatively simple subcircuits is often referred 
to as the register-transfer level (RTL) style of code. It is the most popular design method 
used in practice. The rest of this section lists some common errors found in Verilog code 
and gives some useful guidelines. 
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module mealy (Clock, w, Resetn, z); 
input Clock, w, Resetn; 
output reg z; 
reg y, Y; 
parameter A= 1’b0, B = 1’b1; 


always @(w, y) 
case (y) 
A: if (w == 0) 

begin 
Y=A; 
z=0; 

end 

else 

begin 
Y=B; 
z=0; 

end 

B: if (w == 0) 

begin 
Y=A; 
z=0; 

end 

else 

begin 
Y=B; 
z=1; 

end 

endcase 


always @(posedge Clock, negedge Resetn) 
if (Resetn == 0) 
y <=A; 
else 
y <=Y; 


endmodule 


Figure A.41 = Code for a Mealy machine. 
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Missing Begin-End 
The always construct requires begin and end delimiters if there are multiple statements 


in the block. The compiler does not take indentation into consideration! For example the 
always block 


always @(w0, wl, s) 
if(s==1) 
f=wl; 
f= w0; 


has one statement in it, not two. 


Missing Semicolon 

Every Verilog statement must end with a semicolon. 

Missing { } 

The replication operator requires a lot of brackets. A common mistake is to write 
{3{A}, 2{B}} instead of {{3{A}}, {2{B}}}. 


Accidental Assignment 
The statement 


always @(w0, wl, s) 


begin 
if(s=1) 
f=wl; 
f= w0; 
end 


does not test the value of s. It sets s to the value 1. 


Incomplete Sensitivity List 
Consider the always block 


always @(x) 


begin 
s=x’ y; 
c=x&y; 
end 


When using Verilog for synthesis, this code produces the desired circuit, which is a half- 
adder. However, if this code is used for simulation of circuits, the values of s and c are 
updated only as a result of changes in the value of x. Changes in y will have no effect, 
because y is not included in the sensitivity list. To avoid mismatches between simulation 
results and synthesized circuits, always blocks for combinational circuits should include 
all signals used on the right-hand side of assignments in the block. 
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Variables versus Nets 

Only nets can serve as the targets of continuous assignment statements. Variables 
assigned values inside an always block have to be of type reg or integer. It is not possible to 
make an assignment to a signal both inside an always block and via a continuous assignment 
statement. 


Assignments in Multiple always Blocks 


In a module that has multiple always blocks, all the always blocks are concurrent with 
respect to one another. Therefore, a given variable should never be assigned a value in 
more than one always block. Doing so would mean that there exist multiple concurrent 
assignments to this variable, which makes no sense. 


Blocking versus Non-blocking Assignments 


When describing a combinational circuit in an always construct, it is best to use only 
blocking assignments (see Section A.11.7). For sequential circuits, non-blocking assign- 
ments should be used (see Section A.14.5). Blocking and non-blocking assignments should 
not be mixed in a single always block. 

It is not possible to model both a combinational output and a sequential output in a 
single always block. The sequential output requires an edge-triggered event control, such 
as @(posedge Clock), and this means that all variables assigned a value in the always block 
will be implemented as the outputs of flip-flops. 


Module Instantiation 


A defparam statement must reference the instance name of a module, but not just the 
subcircuit’s module name. The code 


bit_count cbits (T, C); 
defparam bit_count.n = 8, bit_count.logn = 3; 


is illegal, while the code 


bit_count cbits (T, C); 
defparam cbits.n = 8, cbits.logn = 3; 


is syntactically correct. 


Label, Net, and Variable Names 


It is illegal to use any Verilog keyword as a label, net, or variable name. For example, 
it is illegal to name a signal input or output. 


Labeled Begin-End Blocks 


It is legal to define a variable or parameter inside a begin-end block, but only if the 
block has a label. The code 
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is illegal, while the code 


is syntactically correct. 


Implied Memory 
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always @(X) 
begin 
integer k; 
Count = 0; 
for (k =0;k <n; k=k+1) 
Count = Count + X[k]; 
end 


always @(X) 
begin: label 
integer k; 
Count = 0; 
for (k = 0;k <n; k=k+1) 
Count = Count + X[k]; 
end 


As shown in Section A.14.1, implied memory is used to describe storage elements. 
Care must be taken to avoid unintentional implied memory. The code 


always @(LA) 
if (LA == 1) 
EA=1; 


results in implied memory for the EA variable. If this is not desired, then the code can be 


fixed by writing 


always @(LA) 


if (LA == 1) 
EA=1; 
else 
EA = 0; 


Implied memory also applies to case statements. The code 


always @(W) 


case (W) 
2’°b01: EA=1; 
2’b10: EB =1; 


endcase 


REFERENCES 


does not specify the value of the EA variable when W is not equal to 01, and it does not 
specify the value of EB when W is not equal to 10. To avoid having implied memory for 
both EA and EB, these variables should be assigned default values, as in the code 


always @(W) 


begin 
EA = 0; EB = 0; 
case (W) 
2°b01: EA=1; 
2°b10: EB =1; 
endcase 
end 
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A.16 CONCLUDING REMARKS 


This appendix describes the important features of Verilog that are useful for the synthesis of 
logic circuits. As mentioned earlier, we do not discuss many features of Verilog which are 
useful only for simulation of circuits or for other purposes. A reader who wishes to learn 
more about using Verilog can refer to specialized books [1-7]. 
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IMPLEMENTATION TECHNOLOGY 


This appendix provides a detailed discussion of integrated circuit technology. We show 
how transistors operate as simple switches, and how logic gates are built using transistors. 
We also describe the structure of integrated circuit chips that are used to implement logic 
circuits, including programmable logic devices, gate arrays, and memory chips. Finally, we 
study some fundamental properties of electronic circuits, including current flow and power 
dissipation. 

Let us first consider how logic variables can be physically represented as signals in 
electronic circuits. Our discussion will be restricted to binary variables, which can take 
on only the values 0 and 1. In a circuit these values can be represented either as levels of 
voltage or current. Both alternatives are used in different technologies. We will focus on 
the simplest and most popular representation, using voltage levels. 

The most obvious way of representing two logic values as voltage levels is to define a 
threshold voltage; any voltage below the threshold represents one logic value, and voltages 
above the threshold correspond to the other logic value. It is an arbitrary choice as to 
which logic value is associated with the low and high voltage levels. Usually, logic 0 is 
represented by the low voltage levels and logic | by the high voltages. This is known as 
a positive logic system. The opposite choice, in which the low voltage levels are used to 
represent logic | and the higher voltages are used for logic 0 is known as a negative logic 
system. In this book we use only the positive logic system, but negative logic is discussed 
briefly in Section B.4. 

Using the positive logic system, the logic values 0 and | are referred to simply as “low” 
and “high.” To implement the threshold-voltage concept, a range of low and high voltage 
levels is defined, as shown in Figure B.1. The figure gives the minimum voltage, called Vss, 
and the maximum voltage, called Vpp, that can exist in the circuit. We will assume that Vss5 
is 0 volts, corresponding to electrical ground, denoted Gnd. The voltage Vpp represents the 
power supply voltage. The most common levels for Vpp are between 5 volts and | volt. In 
this appendix we will mostly use the value Vpp = 5 V. Figure B.1 indicates that voltages 
in the range Gnd to Vo, nay represent logic value 0. The name Vo,,,ax means the maximum 
voltage level that a logic circuit must recognize as low. Similarly, the range from V) nin 
to Vpp corresponds to logic value 1, and Vj jn is the minimum voltage level that a logic 
circuit must interpret as high. The exact levels of Vo may and V1, min depend on the particular 
technology used; a typical example might set Vo max to 40 percent of Vpp and Vj min to 60 
percent of Vpp. The range of voltages between Vo max and V1 nin is undefined. Logic signals 
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Voltage 
Vop 
Logic value 1 
Vi min 
Undefined 
Vo,max 
Logic value 0 
Vss (Gnd) 
Figure B.1_ —_ Representation of logic values by voltage levels. 


do not normally assume voltages in this range except in transition from one logic value to 
the other. We will discuss the voltage levels used in logic circuits in more depth in Section 
B.8.3. 


B.1 TRANSISTOR SWITCHES 


Logic circuits are built with transistors. A full treatment of transistor behavior is beyond 
the scope of this text; it can be found in electronics textbooks, such as [1] and [2]. For 
the purpose of understanding how logic circuits are built, we can assume that a transistor 
operates as a simple switch. Figure B.2a shows a switch controlled by a logic signal, 
x. When x is low, the switch is open, and when x is high, the switch is closed. The most 
popular type of transistor for implementing a simple switch is the metal oxide semiconductor 
field-effect transistor (MOSFET). There are two different types of MOSFETs, known as n- 
channel, abbreviated NMOS, and p-channel, denoted PMOS. 

Figure B.2b gives a graphical symbol for an NMOS transistor. It has four electrical 
terminals, called the source, drain, gate, and substrate. In logic circuits the substrate (also 
called body) terminal is connected to Gnd. We will use the simplified graphical symbol in 
Figure B.2c, which omits the substrate node. There is no physical difference between the 
source and drain terminals. They are distinguished in practice by the voltage levels applied 
to the transistor; by convention, the terminal with the lower voltage level is deemed to be 
the source. 
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(c) Simplified symbol for an NMOS transistor 


Figure B.2,) = NMOS transistor as a switch. 


A detailed explanation of how the transistor operates will be presented in Section B.8.1. 
For now it is sufficient to know that it is controlled by the voltage Vg at the gate terminal. 
If Vg is low, then there is no connection between the source and drain, and we say that the 
transistor is turned off. If Vg is high, then the transistor is turned on and acts as a closed 
switch that connects the source and drain terminals. In Section B.8.2 we show how to 
calculate the resistance between the source and drain terminals when the transistor is turned 
on, but for now assume that the resistance is 0 Q. 

PMOS transistors have the opposite behavior of NMOS transistors. The former are 
used to realize the type of switch illustrated in Figure B.3a, where the switch is open when 
the control input x is high and closed when x is low. A symbol is shown in Figure B.3b. 
In logic circuits the substrate of the PMOS transistor is connected to Vpp, leading to the 
simplified symbol in Figure B.3c. If Vg is high, then the PMOS transistor is turned off and 
acts like an open switch. When Vg is low, the transistor is turned on and acts as a closed 
switch that connects the source and drain. In the PMOS transistor the source is the node 
with the higher voltage. 

Figure B.4 summarizes the typical use of NMOS and PMOS transistors in logic circuits. 
An NMOS transistor is turned on when its gate terminal is high, while a PMOS transistor 
is turned on when its gate is low. When the NMOS transistor is turned on, its drain is 
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Figure B.3 PMOS transistor as a switch. 


pulled down to Gnd, and when the PMOS transistor is turned on, its drain is pulled up to 
Vpp. Because of the way the transistors operate, an NMOS transistor cannot be used to 
pull its drain terminal completely up to Vpp. Similarly, a PMOS transistor cannot be used 
to pull its drain terminal completely down to Gnd. We discuss the operation of MOSFETs 
in considerable detail in Section B.8. 


B.2 NMOS Loecic GATES 


The first schemes for building logic gates with MOSFETs became popular in the 1970s 
and relied on either PMOS or NMOS transistors, but not both. Since the early 1980s, a 
combination of both NMOS and PMOS transistors has been used. We will first describe 
how logic circuits can be built using NMOS transistors because these circuits are easier 
to understand. Such circuits are known as NMOS circuits. Then we will show how 
NMOS and PMOS transistors are combined in the presently popular technology known as 
complementary MOS, or CMOS. 

In the circuit in Figure B.S5a, when V, = 0 V the NMOS transistor is turned off. No 
current flows through the resistor R, and Vy = 5 V. On the other hand, when V, = 5 V, the 
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Figure B.4 = NMOS and PMOS transistors in logic circuits. 


transistor is turned on and pulls Vy to a low voltage level. The exact voltage level of V¢ 
in this case depends on the amount of current that flows through the resistor and transistor. 
Typically, Vy is about 0.2 V (see Section B.8.3). If Vy is viewed as a function of V,, then the 
circuit is an NMOS implementation of a NOT gate. In logic terms this circuit implements 
the function f = x. Figure B.5b gives a simplified circuit diagram in which the connection 
to the positive terminal on the power supply is indicated by an arrow labeled Vpp and the 
connection to the negative power-supply terminal is indicated by the Gnd symbol. We will 
use this simplified style of circuit diagram for the figures that follow. 

The purpose of the resistor in the NOT gate circuit is to limit the amount of current that 
flows when V,. = 5 V. Rather than using a resistor for this purpose, a transistor is normally 
used. We will discuss this issue in more detail in Section B.8.3. In subsequent diagrams 
a dashed box is drawn around the resistor R as a reminder that it is implemented using a 
transistor. 

Figure B.5c presents the graphical symbols for a NOT gate. The symbol on the left 
shows the input, output, power, and ground terminals, and the symbol on the right is 
simplified to show only the input and output terminals. In practice only the simplified 
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(a) Circuit diagram (b) Simplified circuit diagram 


(c) Graphical symbols 


Figure B.5 = A NOT gate built using NMOS technology. 


symbol is used. Another name often used for the NOT gate is inverter. We use both names 
interchangeably in this book. 

In Section 2.1 we saw that a series connection of switches corresponds to the logic AND 
function, while a parallel connection represents the OR function. Using NMOS transistors, 
we can implement the series connection as depicted in Figure B.6a. If V,, = Vx, =5 V, 
both transistors will be on and V; will be close to 0 V. But if either V,, or V,, is 0, then no 
current will flow through the series-connected transistors and V; will be pulled up to 5 V. 
The resulting truth table for f, provided in terms of logic values, is given in Figure B.6b. 
The circuit realizes a NAND gate. Its graphical symbols are shown in part (c) of the figure. 

The parallel connection of NMOS transistors is given in Figure B.7a. Here, if either 
V,, =5 Vor V,, = 5 V, then Vy will be close to 0 V. Only if both V,, and V,, are 0 will Vy 
be pulled up to 5 V. A corresponding truth table is given in Figure B.7b, which shows that 
the circuit realizes the NOR function. The graphical symbols for the NOR gate appear in 
part (c) of the figure. 

Figure B.8 indicates how an AND gate is built in NMOS technology by following 
a NAND gate with an inverter. Node A realizes the NAND of inputs x; and x, and f 
represents the AND function. In a similar fashion an OR gate is realized as a NOR gate 
followed by an inverter, as depicted in Figure B.9. 
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Figure B.6 = NMOS realization of a NAND gate. 
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So far we have considered the implementation of logic gates using NMOS transistors. For 
each of the circuits that has been presented, it is possible to derive an equivalent circuit 
that uses PMOS transistors. However, it is more interesting to consider how both NMOS 
and PMOS transistors can be used together. The most popular such approach is known as 
CMOS technology. We will see in Section B.8 that CMOS technology offers some attractive 
practical advantages in comparison to NMOS technology. 

In NMOS circuits the logic functions are realized by arrangements of NMOS transistors, 
combined with a pull-up device that acts as a resistor. We will refer to the part of the circuit 
that involves NMOS transistors as the pull-down network (PDN). Then the structure of the 
circuits in Figures B.5 through B.9 can be characterized by the block diagram in Figure 
B.10. The concept of CMOS circuits is based on replacing the pull-up device with a pull-up 
network (PUN) that is built using PMOS transistors, such that the functions realized by the 
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Figure B.7 = NMOS realization of a NOR gate. 


PDN and PUN networks are complements of each other. Then a logic circuit, such as a 
typical logic gate, is implemented as indicated in Figure B.11. For any given valuation of 
the input signals, either the PDN pulls Vy down to Gnd or the PUN pulls Vy up to Vpp. The 
PDN and the PUN have equal numbers of transistors, which are arranged so that the two 
networks are duals of one another. Wherever the PDN has NMOS transistors in series, the 
PUN has PMOS transistors in parallel, and vice versa. 

The simplest example of a CMOS circuit, a NOT gate, is shown in Figure B.12. When 
V, = 0 V, transistor T> is off and transistor T; is on. This makes Vy = 5 V, and since 7} is 
off, no current flows through the transistors. When V, = 5 V, T> is on and T, is off. Thus 
V; = 0 V, and no current flows because T; is off. 

A key point is that no current flows in a CMOS inverter when the input is either low or 
high. This is true for all CMOS circuits; no current flows, and hence no power is dissipated 
under steady state conditions. This property has led to CMOS becoming the most popular 
technology in use today for building logic circuits. We will discuss current flow and power 
dissipation in detail in Section B.8.6. 

Figure B.13 provides a circuit diagram of a CMOS NAND gate. It is similar to the 
NMOS circuit presented in Figure B.6 except that the pull-up device has been replaced by 
the PUN with two PMOS transistors connected in parallel. The truth table in the figure 
specifies the state of each of the four transistors for each logic valuation of inputs x; and 
X2. The reader can verify that the circuit properly implements the NAND function. Under 
static conditions no path exists for current flow from Vpp to Gnd. 
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Figure B.8 = NMOS realization of an AND gate. 


The circuit in Figure B.13 can be derived from the logic expression that defines the 
NAND operation, f = x;x2. This expression specifies the conditions for which f = 1; 
hence it defines the PUN. Since the PUN consists of PMOS transistors, which are turned 
on when their control (gate) inputs are set to 0, an input variable x; turns on a transistor if 
x; = 0. From DeMorgan’s law, we have 


f=%m=%,+% 
Thus f = 1 when either input x, or x2 has the value 0, which means that the PUN must have 
two PMOS transistors connected in parallel. The PDN must implement the complement of 
Ff, which is 
f =xx 


Since fa = | when both x, and x, are 1, it follows that the PDN must have two NMOS 
transistors connected in series. 
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Figure B.9 = NMOS realization of an OR gate. 
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Figure B.10 — Structure of an NMOS circuit. 
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Figure B.11 = Structure of a CMOS circuit. 


(a) Circuit (b) Truth table and transistor states 


Figure B.12 = CMOS realization of a NOT gate. 


The circuit fora CMOS NOR gate is derived from the logic expression that defines the 
NOR operation 


f =x + x2 = X1X2 
Since f = 1 only if both x; and x2 have the value 0, then the PUN consists of two PMOS 


transistors connected in series. The PDN, which realizes f = x; + x2, has two NMOS 
transistors in parallel, leading to the circuit shown in Figure B.14. 
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Figure B13. = CMOS realization of a NAND gate. 
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Figure B.14 = CMOS realization of a NOR gate. 
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Figure B.15 = CMOS realization of an AND gate. 


A CMOS AND gate is built by connecting a NAND gate to an inverter, as illustrated 
in Figure B.15. Similarly, an OR gate is constructed with a NOR gate followed by a NOT 
gate. 

The above procedure for deriving a CMOS circuit can be applied to more general logic 
functions to create complex gates. This process is illustrated in the following two examples. 
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Consider the function 


f =X +X2x3 


Since all variables appear in their complemented form, we can directly derive the PUN. 
It consists of a PMOS transistor controlled by x; in parallel with a series combination of 
PMOS transistors controlled by x2 and x3. For the PDN we have 


f =X, + 0x3 = x1 (x2 + x3) 


This expression gives the PDN that has an NMOS transistor controlled by x; in series with 
a parallel combination of NMOS transistors controlled by x2 and x3. The circuit is shown 
in Figure B.16. 


Example B.1 
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Vpp 


Figure B.16 The circuit for Example B.1. 


Example B.2 Consider the function 
f =X + H+ X3)X4 
Then 
Ff = x1 (@ox3 + x4) 


These expressions lead directly to the circuit in Figure B.17. 


The circuits in Figures B.16 and B.17 show that it is possible to implement fairly 
complex logic functions using combinations of series and parallel connections of transistors 
(acting as switches), without implementing each series or parallel connection as a complete 
AND (using the structure introduced in Figure B.15) or OR gate. 


B.3.1 SPEED OF LOGIC GATE CIRCUITS 


In the preceding sections we have assumed that transistors operate as ideal switches that 
present no resistance to current flow. Hence, while we have derived circuits that realize 
the functionality needed in logic gates, we have ignored the important issue of the speed of 
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Figure B.17 ~The circuit for Example B.2. 


operation of the circuits. In reality transistor switches have a significant resistance when 
turned on. Also, transistor circuits include capacitors, which are created as a side effect 
of the manufacturing process. These factors affect the amount of time required for signal 
values to propagate through logic gates. We provide a detailed discussion of the speed of 
logic circuits, as well as a number of other practical issues, in Section B.8. 
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B.4 NEGATIVE LOGIC SYSTEM 


In the discussion for Figure B.1, we said that logic values are represented as two distinct 
ranges of voltage levels. We are using the convention that the higher voltage levels represent 
logic value | and the lower voltages represent logic value 0. This convention is known 
as the positive logic system, and it is the one used in most practical applications. In this 
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Figure B.18 —_ Voltage levels in the circuit in Figure B.13. 


section we briefly consider the negative logic system in which the association between 
voltage levels and logic values is reversed. 

Let us reconsider the CMOS circuit in Figure B.13, which is reproduced in Figure 
B.18a. Part (b) of the figure gives a truth table for the circuit, but the table shows voltage 
levels instead of logic values. In this table, L refers to the low voltage level in the circuit, 
which is 0 V, and H represents the high voltage level, which is Vpp. This is the style of 
truth table that manufacturers of integrated circuits often use in data sheets to describe the 
functionality of the chips. It is entirely up to the user of the chip as to whether L and H are 
interpreted in terms of logic values such that L = 0 and H = 1,orL=1andH =0. 

Figure B.19a illustrates the positive logic interpretation in which L = 0 and H = 1. 
As we already know from the discussions of Figure B.13, the circuit represents a NAND 
gate under this interpretation. The opposite interpretation is shown in Figure B.19b. Here 
negative logic is used so that L = 1 and H = 0. The truth table specifies that the circuit 
represents a NOR gate in this case. Note that the truth table rows are listed in the opposite 
order from what we normally use, to be consistent with the L and H values in Figure B.18D. 
Figure B.19b also gives the logic gate symbol for the NOR gate, which includes small 
triangles on the gate’s terminals to indicate that the negative logic system is used. 

As another example, consider again the circuit in Figure B.15. Its truth table, in terms 
of voltage levels, is given in Figure B.20a. Using the positive logic system, this circuit 
represents an AND gate, as indicated in Figure B.20b. But using the negative logic system, 
the circuit represents an OR gate, as depicted in Figure B.20c. 

It is possible to use a mixture of positive and negative logic in a single circuit, which 
is known as a mixed logic system. In practice, the positive logic system is used in most 
applications. We do not consider the negative logic system in the main body of the book. 
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(b) Negative logic truth table and gate symbol 


Figure B.19 _ Interpretation of the circuit in Figure B.18. 
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In Chapter | we mentioned that several different types of integrated circuit chips are avail- 
able for implementation of logic circuits. We now discuss the available choices in some 
detail. 


B.5.1 7400-SERIES STANDARD CHIPS 


An approach used widely until the mid-1980s was to connect together multiple chips, each 
containing only a few logic gates. A wide assortment of chips, with different types of logic 
gates, is available for this purpose. They are known as 7400-series parts because the chip 
part numbers always begin with the digits 74. An example of a 7400-series part is given 
in Figure B.21. Part (a) of the figure shows a type of package that the chip is provided in, 
called a dual-inline package (DIP). Part (b) illustrates the 7404 chip, which comprises six 
NOT gates. The chip’s external connections are called pins or leads. Two pins are used 
to connect to Vpp and Gnd, and other pins provide connections to the NOT gates. Many 
7400-series chips exist, and they are described in the data books produced by manufacturers 
of these chips [3-7]. Diagrams of some of the chips are also included in several textbooks, 
such as [8-12]. 
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Figure B.20 _Interpretation of the circuit in Figure B.15. 


The 7400-series chips are produced in standard forms by a number of integrated circuit 
manufacturers, using agreed-upon specifications. Competition among various manufactur- 
ers works to the designer’s advantage because it tends to lower the price of chips and ensures 
that parts are always readily available. For each specific 7400-series chip, several variants 
are built with different technologies. For instance, the part called 74LS00 is built with a 
technology called transistor-transistor logic (TTL), which is described in [1], whereas the 
74HCO00 is fabricated using CMOS technology. In general, the most popular chips used 
today are the CMOS variants. 

As an example of how a logic circuit can be implemented using 7400-series chips, 
consider the function f = x1x2 + X2x3. A NOT gate is required to produce x2, as well as 2 
two-input AND gates and a two-input OR gate. Figure B.22 shows three 7400-series chips 
that can be used to implement the function. We assume that the three input signals x), x2, 
and x3 are produced as outputs by some other circuitry that can be connected by wires to 
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(a) Dual-inline package 


(b) Structure of 7404 chip 


Figure B.21_) =A 7400-series chip. 


the three chips. Notice that power and ground connections are included for all three chips. 
This example makes use of only a portion of the gates available in the three chips, hence 
the remaining gates can be used to realize other functions. 

Because of their low logic capacity, the standard chips are seldom used in practice today, 
with one exception. Many modern products include standard chips that contain drivers. 
Drivers (also sometimes called buffers) are logic gates that are usually used to improve the 
speed of circuits. An example of a driver chip is depicted in Figure B.23. It is the 74244 
chip, which comprises eight tri-state drivers. We describe how tri-state drivers work in 
Section 7.1.1. Rather than showing how the drivers are arranged inside the chip package, 
as we did for the NOT gates in Figure B.21, we show only the pin numbers of the package 
pins that are connected to the drivers. The package has 20 pins, and they are numbered 
with pin | in the lower-left corner, and pin 20 in the upper-left corner. The Gnd and Vpp 
connections are provided on pins 10 and 20, respectively, similar to the arrangement shown 
in Figure B.21. Many other driver chips also exist. For example, the 162244 chip has 16 
tri-state drivers. It is part of a family of devices that are similar to the 7400-series chips 
but with twice as many gates in each chip. These chips are available in multiple types of 
packages, with the most popular being a small-outline integrated circuit (SOIC) package. 
An SOIC package has a similar shape to a DIP, but the SOIC is considerably smaller in 
physical size. 

As integrated circuit technology has improved over time, a system of classifying chips 
according to their size has evolved. The earliest chips produced, such as the 7400-series, 
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The 74244 driver chip. 


B.6 PROGRAMMABLE LOGIC DEVICES 


comprise only a few logic gates. The technology used to produce these chips is referred to 
as small-scale integration (SSI). Chips that include slightly more logic circuitry, typically 
about 10 to 100 gates, represent medium-scale integration (MSI). Until the mid-1980s 
chips that were too large to qualify as MSI were classified as large-scale integration (LSI). 
In recent years the concept of classifying circuits according to their size has become of 
little practical use. Most integrated circuits today contain many thousands or millions of 
transistors. Regardless of their exact size, these large chips are said to be made with very 
large scale integration (VLSI) technology. The trend in digital hardware products is to 
integrate as much circuitry as possible onto a single chip. Thus most of the chips used 
today are built with VLSI technology, and the older types of chips are used rarely. 
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The function provided by each of the 7400-series parts is fixed and cannot be tailored to suit 
a particular design situation. This fact, coupled with the limitation that each chip contains 
only a few logic gates, makes these chips inefficient for building large logic circuits. It is 
possible to manufacture chips that contain relatively large amounts of logic circuitry with 
a structure that is not fixed. Such chips were first introduced in the 1970s and are called 
programmable logic devices (PLDs). 

APLDisa general-purpose chip for implementing logic circuits. It contains a collection 
of logic circuit elements that can be customized in different ways. A PLD can be viewed as 
a “black box” that contains logic gates and programmable switches, as illustrated in Figure 
B.24. The programmable switches allow the logic gates inside the PLD to be connected 
together to implement whatever logic circuit is needed. 


Logic gates 


Inputs 
nputs sant 


(logic variables) 


Outputs 
(logic functions) 


programmable 
switches 


Figure B.24 — Programmable logic device as a black box. 


754 


APPENDIX B ° IMPLEMENTATION TECHNOLOGY 


B.6.1 PROGRAMMABLE LoGic ARRAY (PLA) 


Several types of PLDs are commercially available. The first developed was the pro- 
grammable logic array (PLA). The general structure of a PLA is depicted in Figure B.25. 
Based on the idea that logic functions can be realized in sum-of-products form, a PLA 
comprises a collection of AND gates that feeds a set of OR gates. As shown in the figure, 


the PLA’s inputs x), ..., x, pass through a set of drivers (which provide both the true value 
and complement of each input) into a circuit block called an AND plane, or AND array. 
The AND plane produces a set of product terms P;,..., Py. Each of these terms can be 
configured to implement any AND function of x,,...,x,. The product terms serve as the 
inputs to an OR plane, which produces the outputs f\, ..., fm. Each output can be config- 
ured to realize any sum of P;,..., Px, and hence any sum-of-products function of the PLA 
inputs. 


A more detailed diagram of a small PLA is given in Figure B.26, which shows a PLA 
with three inputs, four product terms, and two outputs. Each AND gate in the AND plane 
has six inputs, corresponding to the true and complemented versions of the three input 
signals. Each connection to an AND gate is programmable; a signal that is connected to 
an AND gate is indicated with a wavy line, and a signal that is not connected to the gate is 
shown with a broken line. The circuitry is designed such that any unconnected AND-gate 
inputs do not affect the output of the AND gate. In commercially available PLAs, several 
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Figure B.25 — General structure of a PLA. 
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Figure B.26 — Gate-level diagram of a PLA. 


methods of realizing the programmable connections exist. Detailed explanation of how a 
PLA can be built using transistors is given in Section B.10. 

In Figure B.26 the AND gate that produces P; is shown connected to the inputs x; and 
xX. Hence P; = x,x2. Similarly, P2 = x1%3, P3 = X1X2x3, and P4 = x;x3. Programmable 
connections also exist for the OR plane. Output f; is connected to product terms P;, Po, 
and P3. It therefore realizes the function fj = x1x2 + x1%3 + X1X2x3. Similarly, output fp = 
X1Xq + X1X2x3 + x1x3. Although Figure B.26 depicts the PLA programmed to implement 
the functions described above, by programming the AND and OR planes differently, each 
of the outputs f; and f; could implement various functions of x;, x2, and x3. The only 
constraint on the functions that can be implemented is the size of the AND plane because 
it produces only four product terms. Commercially available PLAs come in larger sizes 
than we have shown here. Typical parameters are 16 inputs, 32 product terms, and eight 
outputs. 
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Figure B.27 = Customary schematic for the PLA in Figure B.26. 


Although Figure B.26 illustrates clearly the functional structure of a PLA, this style of 
drawing is awkward for larger chips. Instead, it has become customary in technical literature 
to use the style shown in Figure B.27. Each AND gate is depicted as a single horizontal 
line attached to an AND-gate symbol. The possible inputs to the AND gate are drawn as 
vertical lines that cross the horizontal line. At any crossing of a vertical and horizontal 
line, a programmable connection, indicated by an X, can be made. Figure B.27 shows the 
programmable connections needed to implement the product terms in Figure B.26. Each 
OR gate is drawn in a similar manner, with a vertical line attached to an OR-gate symbol. 
The AND-gate outputs cross these lines, and corresponding programmable connections can 
be formed. The figure illustrates the programmable connections that produce the functions 
f\ and f; from Figure B.26. 

The PLA is efficient in terms of the area needed for its implementation on an integrated 
circuit chip. For this reason, PLAs are often included as part of larger chips, such as 
microprocessors. In this case a PLA is created so that the connections to the AND and OR 
gates are fixed, rather than programmable. In Section B.10 we will show that both fixed 
and programmable PLAs can be created with similar structures. 
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B.6.2 PROGRAMMABLE ARRAY LoGIc (PAL) 


In a PLA both the AND and OR planes are programmable. Historically, the programmable 
switches presented two difficulties for manufacturers of these devices: they were hard to 
fabricate correctly, and they reduced the speed-performance of circuits implemented in the 
PLAs. These drawbacks led to the development of a similar device in which the AND plane 
is programmable, but the OR plane is fixed. Such a chip is known as a programmable array 
logic (PAL) device. 

An example of a PAL with three inputs, four product terms, and two outputs is given 
in Figure B.28. The product terms P; and P, are hardwired to one OR gate, and P3 and 
P,4 are hardwired to the other OR gate. The PAL is shown programmed to realize the two 
logic functions fj = x1x2%3 + X1x2x3 and fo = X1)X2 + X1xX2x3. In comparison to the PLA in 
Figure B.27, the PAL offers less flexibility; the PLA allows up to four product terms per 
OR gate, whereas the OR gates in the PAL have only two inputs. To compensate for the 
reduced flexibility, PALs are manufactured in a range of sizes, with various numbers of 
inputs and outputs, and different numbers of inputs to the OR gates. 

An example of a commonly used PAL is called the 22V10 [13], which is depicted in 
Figure B.29. There are 11 input pins that feed the AND plane, and an additional input that 
can also serve as aclock input. The OR gates are of variable size, ranging from 8 to 16 inputs. 


AND plane 


Figure B.28 — An example of a PAL. 
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Figure B.29 The 22V10 PAL device. 


Each output pin has a tri-state driver, which allows the pin to optionally be used as an input 
pin. Extra circuitry is connected to each OR gate and is referred to as a macrocell. 

Figure B.30 shows one of the macrocells in the 22V10 PAL. It connects the OR gate 
to one input on an XOR gate, which feeds a D flip-flop. Since the other input to the XOR 
gate can be programmed to be 0 or 1, it can be used to complement the OR-gate output. 
A 2-to-1 multiplexer allows bypassing of the flip-flop, and the tri-state driver can be either 
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Figure B.30 = The 22V10 macrocell. 


permanently enabled or connected to a product term from the AND plane. Either the Q 
output from the flip-flop or the output of the tri-state driver can be connected to the AND 
plane. If the tri-state driver is disabled, then the corresponding pin can be used as an input. 

A number of companies manufacture PLAs or PALs, or other, similar types of simple 
PLDs (SPLDs), including Altera, Atmel, and Lattice. An interested reader can examine the 
information that these companies provide on their web sites. 


B.6.3 PROGRAMMING OF PLAs AND PALS 


In Figures B.27 and B.28, each connection between a logic signal in a PLA or PAL and the 
AND/OR gates is shown as an X. We describe how these switches are implemented using 
transistors in Section B.10. Users’ circuits are implemented in the devices by configuring, 
or programming, these switches. Commercial chips contain a few thousand programmable 
switches, hence it is not feasible for a user of these chips to specify manually the desired 
programming state of each switch. Instead, CAD systems are employed for this purpose. 
For CAD systems that support targeting of circuits to PLDs, the tools have the capability 
to automatically produce the necessary information for programming each of the switches 
in the device. A computer system that runs the CAD tools may be connected by a cable 
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to a dedicated programming unit. Once the user has completed the design of a circuit, the 
CAD tools generate a file, often called a programming file or fuse map, that specifies the 
state that each switch in the PLD should have, to realize correctly the designed circuit. The 
PLD is placed into the programming unit, and the programming file is transferred from the 
computer system. The programming unit places the chip into a special programming mode 
and configures each switch individually. 

PLAs or PALs used as part of a logic circuit usually reside with other chips on a printed 
circuit board (PCB). The procedure described above assumes that the chip can be removed 
from the circuit board for programming in the programming unit. Removal is made possible 
by using a socket on the PCB, as illustrated in Figure B.31. Although PLAs and PALs are 
available in the DIP packages shown in Figure B.21a, they are also available in another 
popular type of package, called a plastic-leaded chip carrier (PLCC), which is depicted in 
Figure B.31. On all four of its sides, the PLCC package has pins that “wrap around” the 
edges of the chip, rather than extending straight down as in the case of a DIP. The socket 
that houses the PLCC is attached by solder to the circuit board, and the PLCC is held in the 
socket by friction. 

Instead of relying on a programming unit to configure a chip, it would be advantageous 
to be able to perform the programming while the chip is still attached to its circuit board. This 
method of programming is called in-system programming (ISP). It is not usually provided 
for PLAs or PALs, but is available for the more sophisticated chips that are described below. 


Figure B.31) = A PLCC package with socket. 
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B.6.4 COMPLEX PROGRAMMABLE LOoGIc DEvICcEs (CPLDs) 


PLAs and PALs are useful for implementing a wide variety of small digital circuits. Each 
device can be used to implement circuits that do not require more than the number of inputs, 
product terms, and outputs that are provided in the particular chip. These chips are limited 
to fairly modest sizes, typically supporting a combined number of inputs plus outputs of not 
more than 32. For implementation of circuits that require more inputs and outputs, either 
multiple PLAs or PALs can be employed or else a more sophisticated type of chip, called 
a complex programmable logic device (CPLD), can be used. 

A CPLD comprises multiple circuit blocks on a single chip, with internal wiring re- 
sources to connect the circuit blocks. Each circuit block is similar to a PLA or a PAL; we 
will refer to the circuit blocks as PAL-like blocks. An example of a CPLD is given in Figure 
B.32. It includes four PAL-like blocks that are connected to a set of interconnection wires. 
Each PAL-like block is also connected to a subcircuit labeled //O block, which is attached 
to a number of the chip’s input and output pins. 

Figure B.33 shows an example of the wiring structure and the connections to a PAL-like 
block in a CPLD. The PAL-like block includes 3 macrocells (real CPLDs typically have 
about 16 macrocells in a PAL-like block), each consisting of a four-input OR gate (real 
CPLDs usually provide between 5 and 20 inputs to each OR gate). The OR-gate output is 
connected to an XOR gate; the other input to this gate can be programmably connected to | 
or 0; if 1, then the XOR gate complements the OR-gate output, and if 0, then the XOR gate 
has no effect. The macrocell also includes a flip-flop, a multiplexer, and a tri-state driver. 
Each tri-state driver is connected to a pin on the CPLD package. The tri-state driver allows 
each pin to be used either as an output from the CPLD or as an input. 
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Figure B.32 — Structure of a complex programmable logic device (CPLD). 
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PAL-like block (details not shown) 


Figure B.33 A section of the CPLD in Figure B.32. 


The interconnection wiring contains programmable switches that are used to connect 
the PAL-like blocks. Each of the horizontal wires can be connected to some of the vertical 
wires that it crosses. One detail to note is that when a pin is used as an input, the macrocell 
associated with that pin cannot be used and is therefore wasted. Some CPLDs include 
additional connections between the macrocells and the interconnection wiring that avoids 
wasting macrocells in such situations. 

Commercial CPLDs range in size from only 2 PAL-like blocks to more than 100 PAL- 
like blocks. They are available in a variety of packages, including the PLCC package that 
is shown in Figure B.31. Figure B.34a shows another type of package used to house CPLD 
chips, called a quad flat pack (QFP). Like a PLCC package, the QFP package has pins on all 
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(a) CPLD in a Quad Flat Pack (QFP) package 


To computer 


Printed 
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(b) JTAG programming 


Figure B.34 = CPLD packaging and programming. 


four sides, but whereas the PLCC’s pins wrap around the edges of the package, the QFP’s 
pins extend outward from the package, with a downward-curving shape. The QFP’s pins 
are much thinner than those on a PLCC, which means that the package can support a larger 
number of pins; QFPs are available with more than 200 pins, whereas PLCCs are limited 
to fewer than 100 pins. 

Most CPLDs contain the same type of programmable switches that are used in SPLDs, 
which are described in Section B.10. Programming of the switches may be accomplished 
using the same technique described in Section B.6.3, in which the chip is placed into a 
special-purpose programming unit. However, this programming method is rather inconve- 
nient for large CPLDs for two reasons. First, large CPLDs may have more than 200 pins 
on the chip package, and these pins are often fragile and easily bent. Second, to be pro- 
grammed in a programming unit, a socket is required to hold the chip. Sockets for large QFP 
packages are very expensive; they may cost more than the CPLD device itself. For these 
reasons, CPLD devices usually support the ISP technique. A small connector is included 
on the PCB that houses the CPLD, and a cable is connected between that connector and a 
computer system. The CPLD is programmed by transferring the programming information 
generated by a CAD system through the cable, from the computer into the CPLD. The 
circuitry on the CPLD that allows this type of programming has been standardized by the 
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IEEE and is usually called a JTAG port. It uses four wires to transfer information between 
the computer and the device being programmed. The term JTAG stands for Joint Test Action 
Group. Figure B.345 illustrates the use of a JTAG port for programming two CPLDs on a 
circuit board. The CPLDs are connected together so that both can be programmed using 
the same connection to the computer system. Once a CPLD is programmed, it retains the 
programmed state permanently, even when the power supply for the chip is turned off. This 
property is called nonvolatile programming. 

CPLDs are used for the implementation of many types of digital circuits. In industrial 
designs that employ some type of PLD device, CPLDs are used often, while SPLDs are 
becoming less common. A number of companies offer CPLDs, including Altera, Lattice, 
and Xilinx. Information about these products is available on the internet. 


B.6.5 FIELD-PROGRAMMABLE GATE ARRAYS 


The types of chips described above, 7400 series, SPLDs, and CPLDs, are useful for imple- 
mentation of small or moderately-sized logic circuits. One way to quantify a circuit’s size 
is to assume that the circuit is to be built using only simple logic gates and then estimate 
how many of these gates are needed. A commonly used measure is the total number of 
two-input NAND gates that would be needed to build the circuit; this measure is often 
called the number of equivalent gates. 

Using the equivalent-gates metric, the size of a 7400-series chip is simple to measure 
because each chip contains only simple gates. For SPLDs and CPLDs the typical measure 
used is that each macrocell represents about 20 equivalent gates. Thus a typical PAL that 
has eight macrocells can accommodate a circuit that needs up to about 160 gates, and a 
large CPLD that has 500 macrocells can implement circuits of up to about 10,000 equivalent 
gates. 

By modern standards, a logic circuit with 10,000 gates is not large. To implement 
larger circuits, it is convenient to use a different type of chip that has a larger logic capacity. 
A field-programmable gate array (FPGA) is a programmable logic device that supports 
implementation of relatively large logic circuits. FPGAs are quite different from SPLDs and 
CPLDs because FPGAs do not contain AND or OR planes. Instead, FPGAs provide logic 
elements for implementation of the required functions. The general structure of an FPGA 
is illustrated in Figure B.35a. It contains three main types of resources: logic elements, I/O 
blocks for connecting to the pins of the package, and interconnection wires and switches. 
The logic elements are arranged in a two-dimensional array, and the interconnection wires 
are organized as horizontal and vertical routing channels between rows and columns of 
logic blocks. The routing channels contain wires and programmable switches that allow 
the logic elements to be interconnected in many ways. Figure B.35a shows two locations 
for programmable switches; the blue boxes adjacent to logic elements hold switches that 
connect the logic element input and output terminals to the interconnection wires, and the 
blue boxes that are diagonally between logic elements connect one interconnection wire to 
another (such as a vertical wire to a horizontal wire). Programmable connections also exist 
between the I/O blocks and the interconnection wires. The actual number of programmable 
switches and wires in an FPGA varies in commercially available chips. 
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(a) General structure of an FPGA 


(b) Pin grid array (PGA) package (bottom view) 


Figure B.35 A field-programmable gate array (FPGA). 


FPGAs can be used to implement logic circuits of more than a million equivalent gates 
in size. Examples of commercial FPGA products can be found on the websites provided by 
Altera and Xilinx, which are the leading providers of these chips. FPGA chips are available 
in a variety of packages, including the PLCC and QFP packages described earlier. Fig- 
ure B.35b depicts another type of package, called a pin grid array (PGA). A PGA package 
may have up to a few hundred pins in total, which extend straight outward from the bottom of 
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the package, in a grid pattern. Another version of this packaging technology is known as 
the ball grid array (BGA). In a BGA the pins are small round balls, instead of posts. The 
advantage of BGA packages is that the pins are very small; hence more pins can be provided 
on a relatively small package. 

Each logic element in an FPGA typically has a small number of inputs and outputs. 
A variety of FPGA products are on the market, featuring different types of logic elements. 
The most commonly used logic element is a lookup table (LUT), which contains storage 
cells that are used to store the truth table of a logic function. Each cell is capable of holding 
a single logic value, either 0 or 1. LUTs of various sizes may be created, where the size is 
defined by the number of inputs. Figure B.36a shows the structure of a small LUT. It has 
two inputs, x; and x2, and one output, f. It is capable of implementing any logic function of 
two variables. Because a two-variable truth table has four rows, this LUT has four storage 
cells. One cell corresponds to the output value in each row of the truth table. The input 
variables x; and x2 are used as the select inputs of three multiplexers that provide the content 
of one of the four storage cells as the output of the LUT. We discuss storage cells in Sec- 
tion B.9. 


- OOK 


(c) Storage cell contents in the LUT 


Figure B.36 —A two-input lookup table (LUT). 
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To see how a logic function can be realized in the two-input LUT, consider the truth table 
in Figure B.36b. The function f; from this table can be stored in the LUT as illustrated in 
Figure B.36c. The arrangement of multiplexers in the LUT correctly realizes the function f,. 
When x; = x2 = 0, the output of the LUT is driven by the top storage cell, which represents 
the entry in the truth table for x;x2 = 00. Similarly, for all valuations of x; and x2, the logic 
value stored in the storage cell corresponding to the entry in the truth table chosen by the 
particular valuation appears on the LUT output. 

Figure B.37 shows a three-input LUT. It has eight storage cells because a three-variable 
truth table has eight rows. In commercial FPGA chips, LUTs usually have between four 
and six inputs, which require 16 and 64 storage cells, respectively. In addition to a LUT, 
FPGA logic elements usually include a flip-flip as indicated in Figure B.38. 
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Figure B.38 _ Inclusion of a flip-flop in an FPGA logic element. 
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For a logic circuit to be realized in an FPGA, each logic function in the circuit must be 
small enough to fit within a single logic element. In practice, a user’s circuit is automati- 
cally translated into the required form by using CAD tools (see Chapter 10). When a circuit 
is implemented in an FPGA, the logic elements are programmed to realize the necessary 
functions and the routing channels are programmed to make the required interconnections 
between logic elements. FPGAs are configured by using the ISP method, which we ex- 
plained in Section B.6.4. The storage cells in the LUTs in an FPGA are volatile, which 
means that they lose their stored contents whenever the power supply for the chip is turned 
off. Hence the FPGA has to be programmed every time power is applied. Often a small 
memory chip that holds its data permanently, called a programmable read-only memory 
(PROM), is included on the circuit board that houses the FPGA. The storage cells in the 
FPGA are loaded automatically from the PROM when power is applied to the chips. 

A small FPGA that has been programmed to implement a circuit is depicted in Figure 
B.39. The FPGA has two-input LUTs, and there are four wires in each routing channel. 
The figure shows the programmed states of both the logic elements and wiring switches 
in a section of the FPGA. Programmable wiring switches are indicated by an X. Each 
switch shown in blue is turned on and makes a connection between a horizontal and vertical 


a 


Figure B.39 —A section of a programmed FPGA. 
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wire. The switches shown in black are turned off. We describe how the switches are 
implemented by using transistors in Section B.10.1. The truth tables programmed into 
the logic elements in the top row of the FPGA correspond to the functions fj = x;x2 and 
fo = X2x3. The logic element in the bottom right of the figure is programmed to produce 
f=fi the = x1%2 + Xox3. 
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The programmable switches in a PLD provide the important benefit of user programmability, 
but they consume a significant amount of space on the chip and lead to increased cost. 
They also result in a reduction in the speed of operation of circuits, and an increase in 
power consumption. In this section we will introduce some high-capacity integrated circuit 
technologies that do not contain programmable switches. 

To provide the largest number of logic gates, highest circuit speed, or lowest power, a 
so-called custom chip can be manufactured. Whereas a PLD is prefabricated, containing 
logic gates and programmable switches that are programmed to realize a user’s circuit, a 
custom chip is created from scratch. The designer of a custom chip has complete flexibility 
to decide the size of the chip, the number of transistors the chip contains, the placement of 
each transistor on the chip, and the way the transistors are connected together. The process 
of defining exactly where on the chip each transistor and wire is situated is called chip 
layout. For a custom chip the designer may create any layout that is desired. A custom chip 
requires a large amount of design effort and is therefore expensive. Consequently, such 
chips are produced only when standard parts like FPGAs do not meet the requirements. To 
justify the expense of a custom chip, the product being designed must be expected to sell in 
sufficient quantities to recoup the cost. Two examples of products that are usually realized 
with custom chips are microprocessors and memory chips. 

In situations where the chip designer does not need complete flexibility for the layout 
of each individual transistor in a custom chip, some of the design effort can be avoided by 
using a technology known as standard cells. Chips made using this technology are often 
called application-specific integrated circuits (ASICs). An example of a small section of an 
ASIC is depicted in Figure B.40. CAD tools are used to automatically place the standard 
cells into rows and to create interconnecting wires in the routing channels between the rows. 
Many variants of standard cells are usually provided, including basic logic gates, latches, 
flip-flops, and so on. The available standard cells are prebuilt and are stored in a library that 
can be accessed by the designer. In Figure B.40 the interconnection wires are drawn in two 
colors. This scheme is used because metal wires can be created on integrated circuits in 
multiple /ayers, which makes it possible for two wires to cross one another without creating 
a short circuit. The blue wires represent one layer of metal wires, and the black wires are a 
different layer. Each blue square represents a hard-wired connection (called a via) between 
a wire on one layer and a wire on the other layer. In current technology it is possible to 
have ten or more layers of metal wiring. Some of the metal layers can be placed on top of 
the transistors in the logic gates, resulting in a more efficient chip layout. 
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Figure B.40 _A section of two rows in a standard-cell chip. 


Like a custom chip, a standard-cell chip is created from scratch according to a user’s 
specifications. The circuitry shown in Figure B.40 implements the two logic functions 
that we realized in a PLA in Figure B.26, namely, fj = xjx2 + x1%3 + X1X2x3 and fo = 
X1X_ + X1X2x3 + x1x3. Because of the expense involved, a standard-cell chip would never 
be created for a small circuit such as this one, and thus the figure shows only a portion 
of a much larger chip. The layout of individual gates (standard cells) is predesigned and 
fixed. The chip layout can be created automatically by CAD tools because of the regular 
arrangement of the logic gates (cells) in rows. A typical chip has many long rows of logic 
gates with a large number of wires between each pair of rows. The I/O blocks around the 
periphery connect to the pins of the chip package, such as a QFP, PGA, or BGA package. 

Another technology, similar to standard cells, is the gate-array technology. In a gate 
array parts of the chip are prefabricated, and other parts are custom fabricated for a par- 
ticular user’s circuit. This concept exploits the fact that integrated circuits are fabricated 
in a sequence of steps, some steps to create transistors and other steps to create wires to 
connect the transistors together. In gate-array technology, the manufacturer performs most 
of the fabrication steps, typically those involved in the creation of the transistors, without 
considering the requirements of a user’s circuit. This process results in a silicon wafer 
of partially finished chips, called the gate-array template. Later the template is modified, 
usually by fabricating wires that connect the transistors together, to create a user’s circuit 
in each finished chip. The gate-array approach provides cost savings in comparison to the 
custom-chip approach because the gate-array manufacturer can amortize the cost of chip 
fabrication over a large number of template wafers, all of which are identical. Many vari- 
ants of gate-array technology exist. Some have relatively large logic cells, while others are 
configurable at the level of a single transistor. 

An example of a gate-array template is given in Figure B.41. The gate array contains a 
two-dimensional array of logic cells. The chip’s general structure is similar to a standard- 
cell chip except that in the gate array all logic cells are identical. Although the types of logic 
cells used in gate arrays vary, one common example is a two- or three-input NAND gate. 
In some gate arrays empty spaces exist between the rows of logic cells to accommodate 
the wires that will be added later to connect the logic cells together. However, most gate 
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Figure B.41 A sea-of-gates gate array. 


arrays do not have spaces between rows of logic cells, and the interconnection wires are 
fabricated on top of the logic cells. This design is possible because, as discussed for Figure 
B.40, metal wires can be created on a chip in multiple layers. This approach is known as the 
sea-of-gates technology. Figure B.42 depicts a small section of a gate array that has been 
customized to implement the logic function f = x2x%3 + x1x3. As we showed in Section 2.7, 
it is easy to verify that this circuit with only NAND gates is equivalent to the AND-OR 
form of the circuit. 
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In this section we provide more detailed information on several aspects of digital circuits. 
We describe how transistors are fabricated in silicon and give a detailed explanation of how 
transistors operate. We discuss the robustness of logic circuits and discuss the important 
issues of signal propagation delays and power dissipation in logic gates. 


B.8.1 MOSFET FABRICATION AND BEHAVIOR 


To understand the operation of NMOS and PMOS transistors, we need to consider how 
they are built in an integrated circuit. Integrated circuits are fabricated on silicon wafers. A 
silicon wafer is usually 6, 8, or 12 inches in diameter and is somewhat similar in appearance 
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Figure B.42 The logic function f, = x2%3 + x13 in the gate array of Figure B.41. 


to an audio compact disc (CD). Many integrated circuit chips are fabricated on one wafer, 
and the wafer is then cut to provide the individual chips. 

Silicon is an electrical semiconductor, which means that it can be manipulated such 
that it sometimes conducts electrical current and at other times does not. A transistor is 
fabricated by creating areas in the silicon substrate that have an excess of either positive 
or negative electrical charge. Negatively charged areas are called type n, and positively 
charged areas are type p. Figure B.43 illustrates the structure of an NMOS transistor. It has 
type n silicon for both the source and drain terminals, and type p for the substrate terminal. 
Metal wiring is used to make electrical connections to the source and drain terminals. 

When MOSFETs were invented, the gate terminal was made of metal. Now a material 
known as polysilicon is used. Like metal, polysilicon is a conductor. The gate is electrically 
isolated from the rest of the transistor by a layer of silicon dioxide (SiOz), which is a type 
of glass that acts as an electrical insulator between the gate terminal and the substrate of 
the transistor. The transistor’s operation is governed by electrical fields caused by voltages 
applied to its terminals, as discussed below. 

In Figure B.43 the voltage levels applied at the source, gate, and drain terminals are 
labeled Vs, Vg, and Vp, respectively. Consider first the situation depicted in Figure B.43a 
in which both the source and gate are connected to Gnd (Vs = Vg = 0 V). The type n source 
and type n drain are isolated from one another by the type p substrate. In electrical terms two 
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Figure B.43 Physical structure of an NMOS transistor. 


diodes exist between the source and drain. One diode is formed by the p—n junction between 
the substrate and source, and the other diode is formed by the p—n junction between the 
substrate and drain. These back-to-back diodes represent a very high resistance (about 10!* 
Q [1]) between the drain and source that prevents current flow. We say that the transistor 
is turned off, or cut off, in this state. 

Next consider the effect of increasing the voltage at the gate terminal with respect to 
the voltage at the source. Let Vgs represent the gate-to-source voltage. If Vgs is greater 
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than a certain minimum positive voltage, called the threshold voltage Vr, then the transistor 
changes from an open switch to a closed switch, as explained below. The exact level of Vr 
depends on many factors, but it is typically about 0.2 Vpp. 

The transistor’s state when Vgs > Vr is illustrated in Figure B.43b. The gate terminal 
is connected to Vpp, resulting in Vgs = 5 V. The positive voltage on the gate attracts free 
electrons that exist in the type n source terminal, as well as in other areas of the transistor, 
toward the gate. Because the electrons cannot pass through the layer of glass under the 
gate, they gather in the region of the substrate between the source and drain, which is called 
the channel. This concentration of electrons inverts the silicon in the area of the channel 
from type p to type n, which effectively connects the source and the drain. The size of 
the channel is determined by the length and width of the gate. The channel length L is the 
dimension of the gate between the source and drain, and the channel width W is the other 
dimension. The channel can also be thought of as having a depth, which is dependent on 
the applied voltages at the source, gate, and drain. 

No current can flow through the gate node of the transistor, due to the layer of glass 
that insulates the gate from the substrate. A current Jp may flow from the drain node to 
the source. For a fixed value of Vgs > Vr, the value of Ip depends on the voltage applied 
across the channel Vps. If Vps = 0 V, then no current flows. As Vpgs is increased, Ip 
increases approximately linearly with the applied Vps, as long as Vp is sufficiently small 
to provide at least Vr volts across the drain end of the channel, that is Vep > Vr. In this 
range of voltages, namely, 0 < Vps < (Vgs — Vr), the transistor is said to operate in the 
triode region, also called the linear region. The relationship between voltage and current 
is approximated by the equation 


Ip = he | Wes = Vr) Vag = 5¥is| [B.1] 
L 2 

The symbol k/, is called the process transconductance parameter. It is a constant that 

depends on the technology being used and has the units A/V. 

As Vp is increased, the current flow through the transistor increases, as given by Equa- 
tion B.1, but only to a certain point. When Vps = Ves — Vr, the current reaches its max- 
imum value. For larger values of Vps, the transistor is no longer operating in the triode 
region. Since the current is at its saturated (maximum) value, we say that the transistor is in 
the saturation region. The current is now independent of Vpgs and is given by the expression 
1,W 


= =k —(Ves — Vr) B.2 
D= ast (Vag = Vr) [B.2] 

Figure B.44 shows the shape of the current-voltage relationship in the NMOS transistor 
for a fixed value of Vgs > Vr. The figure indicates the point at which the transistor leaves 


the triode region and enters the saturation region, which occurs at Vns = Vas — Vr. 


Example B.3 


Assume the values k/ = 60 pA/V?, W/L =2.0 um/0.5 um, Vs = 0 V, Vg = 5 V, and 
Vr = 1 V. If Vp = 2:5 V, the current in the transistor is given by Equation B.1 as Ip © 1.7 
mA. If Vp = 5 V, the saturation current is calculated using Equation B.2 as Ip © 2 mA. 
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Figure B.44 The current-voltage relationship in the NMOS transistor. 


The PMOS Transistor 

The behavior of PMOS transistors is the same as for NMOS except that all voltages and 
currents are reversed. The source terminal of the PMOS transistor is the terminal with the 
higher voltage level (recall that for an NMOS transistor the source terminal is the one with 
the lower voltage level), and the threshold voltage required to turn the transistor on has a 
negative value. PMOS transistors have the same physical construction as NMOS transistors 
except that wherever the NMOS transistor has type n silicon, the PMOS transistor has type 
Pp, and vice versa. For a PMOS transistor the equivalent of Figure B.43a is to connect 
both the source and gate nodes to Vpp, in which case the transistor is turned off. To turn 
the PMOS transistor on, equivalent to Figure B.43b, we would set the gate node to Gnd, 
resulting in Vey = —S V. 

Because the channel is type p silicon, instead of type n, the physical mechanism for 
current conduction in PMOS transistors is different from that in NMOS transistors. A 
detailed discussion of this concept is beyond the scope of this book, but one implication has 
to be mentioned. Equations B.1 and B.2 use the parameter k’,. The corresponding parameter 
for a PMOS transistor is Ke but current flows more readily in type n silicon than in type p, 
with the result that in a typical technology k, ~ 0.4 x k/. For a PMOS transistor to have 
current capacity equal to that of an NMOS transistor, we must use W/L of about two to 
three times larger in the PMOS transistor. In logic gates the sizes of NMOS and PMOS 
transistors are usually chosen to account for this factor. 


B.8.2 MOSFET On-RESISTANCE 


In Section B.1 we considered MOSFETs as ideal switches that have infinite resistance 
when turned off and zero resistance when on. The actual resistance in the channel when the 
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transistor is turned on, referred to as the on-resistance, is given by Vps/Ip. Using Equation 
B.1 we can calculate the on-resistance in the triode region, as shown below. 


Example B.4 


Consider a CMOS inverter in which the input voltage V,. is equal to 5 V. The NMOS transistor 
is turned on, and the output voltage V;/ is close to 0 V. Hence Vpgs for the NMOS transistor 
is close to zero and the transistor is operating in the triode region. In the curve in Figure 
B.44, the transistor is operating at a point very close to the origin. Although the value of 
Vps is small, it is not exactly zero. In the next section we explain that Vps would typically 
be about 0.1 mV. Hence the current Jp is not exactly zero; it is defined by Equation B.1. In 
this equation we can ignore the term involving Vie because Vpg is small. In this case the 
on-resistance is approximated by 


WwW 
Rps = Vps/Ip = 1/| i Wes - vn| [B.3] 


Assuming the values k’ = 60 wA/V?, W/L = 2.0 wm/0.5 xm, Ves = 5 V, and Vr = 1 V, 
we get Rps © 1kQ. 


B.8.3 VOLTAGE LEVELS IN Locic GATES 


In Figure B.1 we showed that the logic values are represented by a range of voltage levels. 
We now consider the issue of voltage levels more carefully. 

The high and low voltage levels in a logic family can be characterized by the operation 
of its basic inverter. Figure B.45a reproduces the circuit in Figure B.5 for an inverter built 


Vpp 
R 
T stat Vy 7 VoL 
Rps 
(a) NMOS NOT gate (b) V,=5V 


Figure B.45 Voltage levels in the NMOS inverter. 


B.8 PRACTICAL ASPECTS 


with NMOS technology. When V, = 0 V, the NMOS transistor is turned off. No current 
flows; hence Vy = 5 V. When V, = Vpp, the NMOS transistor is turned on. To calculate 
the value of V;, we can represent the NMOS transistor by a resistor with the value Rpg, as 
illustrated in Figure B.45b. Then V; is given by the voltage divider 

R 


DS 
avg = 
f BO dR 
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Assume that R = 25 kQ. Using the result from Example B.4, Rpg = 1 kQ, which gives 
Vp © 0.2 V. 

As indicated in Figure B.45), a current Jsq; flows through the NMOS inverter under 
the static condition V, = Vpp. This current is given by 


Isat = Vp /Rps = 0.2 V/1kQ = 0.2 mA 


This static current has important implications, which we discuss in Section B.8.6. 

In practice, the pull-up device R is usually implemented using a PMOS transistor. Such 
circuits are referred to as pseudo-NMOS circuits. They are fully compatible with CMOS 
circuits; hence a single chip may contain both CMOS and pseudo-NMOS gates. Example 
B.17 discusses a circuit for a pseudo-NMOS inverter and shows how to calculate its output 
voltage levels. 


Example B.5 


The CMOS Inverter 


It is customary to use the symbols Voy and Vo, to characterize the voltage levels in 
a logic circuit. The meaning of Voy is the voltage produced when the output is high. 
Similarly, Voz refers to the voltage produced when the output is low. As discussed above, 
in the NMOS inverter Voy = Vpp and Vo, is about 0.2 V. 

Consider again the CMOS inverter in Figure B.12a. Its output-input voltage relation- 
ship is summarized by the voltage transfer characteristic shown in Figure B.46. The curve 
gives the steady-state value of Vy for each value of V,. When V, = 0 V, the NMOS transistor 
is off. No current flows; hence Vr = Voy = Vpp. When V, = Vpp, the PMOS transistor is 
off, no current flows, and Vy = Vo, = O V. For completeness we should mention that even 
when a transistor is turned off, a small current, called the Jeakage current, may flow through 
it. This current has a slight effect on Voy and Vo,. For example, a typical value of Voz is 
0.1 mV, rather than 0 V [1]. 

Figure B.46 includes labels at the points where the output voltage begins to change 
from high to low, and vice versa. The voltage V7z represents the point where the output 
voltage is high and the slope of the curve equals —1. This voltage level is defined as the 
maximum input voltage level that the inverter will interpret as low, hence producing a 
high output. Similarly, the voltage Vj, which is the other point on the curve where the 
slope equals —1, is the minimum input voltage level that the inverter will interpret as high, 
hence producing a low output. The parameters Voy, Vor, Viz, and Viy are important for 
quantifying the robustness of a logic family, as discussed below. 
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Figure B.46 The voltage transfer characteristic for the CMOS inverter. 


B.8.4 NoIsE MARGIN 


Consider the two NOT gates shown in Figure B.47a. Let us refer to the gates on the left 
and right as N, and N3, respectively. Electronic circuits are constantly subjected to random 
perturbations, called noise, which can alter the output voltage levels produced by the gate 
N,. It is essential that this noise not cause the gate N2 to misinterpret a low logic value as 
a high one, or vice versa. Consider the case where N; produces its low voltage level Voz. 
The presence of noise may alter the voltage level, but as long as it remains less than V7z, 
it will be interpreted correctly by Nz. The ability to tolerate noise without affecting the 
correct operation of the circuit is known as noise margin. For the low output voltage, we 
define the low noise margin as 


NM; = Vit — Vor 


A similar situation exists when N, produces its high output voltage Voy. Any existing 
noise in the circuit may alter the voltage level, but it will be interpreted correctly by N> as 
long as the voltage is greater than Vj. The high noise margin is defined as 


NMy = Vou — Vi 
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(a) A NOT gate driving another NOT gate 


Vpp Vpp 


VA 


(b) The capacitive load at node A 


Figure B.47 Parasitic capacitance in integrated circuits. 
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For a given technology the voltage transfer characteristic of the basic inverter determines the 
levels Vor, Vor, Viz, and Vz. For CMOS we showed in Figure B.46 that Voy, = Vpp and 
Vor = 0 V. By finding the two points where the slope of the voltage transfer characteristic 
is equal to —1, it can be shown [1] that Viz = +(3Vpp +2Vr) and Viz = 4(5Vpp —2Vr). 
For the typical value Vr = 0.2 Vpp, this gives 


NM, = NM y = 0.425 x Vpp 


Hence the available noise margin depends on the power supply voltage level. For Vpp = 5 
V, the noise margin is 2.1 V, and for Vpp = 3.3 V, the noise margin is 1.4 V. 


Example B.6 


B.8.5 DYNAMIC OPERATION OF LoGiIc GATES 


In Figure B.47a the node between the two gates is labeled A. Because of the way in which 
transistors are constructed in silicon, N> has the effect of contributing to a capacitive load 
at node A. Figure B.43 shows that transistors are constructed by using several layers of 
different materials. Wherever two types of material meet or overlap inside the transistor, a 
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capacitor may be effectively created. This capacitance is called parasitic, or stray, capaci- 
tance because it results as an undesired side effect of transistor fabrication. In Figure B.47 
we are interested in the capacitance that exists at node A. A number of parasitic capacitors 
are attached to this node, some caused by N; and others caused by N2. One significant 
parasitic capacitor exists between the input of inverter V2 and ground. The value of this 
capacitor depends on the sizes of the transistors in Nj. Each transistor contributes a gate 
capacitance, C, = W x L x C,,. The parameter C,,, called the oxide capacitance, is a 
constant for the technology being used and has the units fF/jzm. Additional capacitance 
is caused by the transistors in N; and by the metal wiring that is attached to node A. It 
is possible to represent all of the parasitic capacitance by a single equivalent capacitance 
between node A and ground [2]. In Figure B.47b this equivalent capacitance is labeled C. 

The existence of stray capacitance has a negative effect on the speed of operation of 
logic circuits. Voltage across a capacitor cannot change instantaneously. The time needed to 
charge or discharge a capacitor depends on the size of the capacitance C and on the amount 
of current through the capacitor. In the circuit of Figure B.47b, when the PMOS transistor in 
N, is turned on, the capacitor is charged to Vpp; it is discharged when the NMOS transistor 
is turned on. In each case the current flow Jp through the involved transistor and the value 
of C determine the rate of charging and discharging the capacitor. 

Chapter 2 introduced the concept of a timing diagram, and Figure 2.10 shows a timing 
diagram in which waveforms have perfectly vertical edges in transition from one logic 
level to the other. In real circuits, waveforms do not have this “ideal” shape, but instead 
have the appearance of those in Figure B.48. The figure gives a waveform for the input V,. 
in Figure B.47b and shows the resulting waveform at node A. We assume that V, is initially at 


Gnd 


Propagation delay Propagation delay 


Va 


Gnd 


Figure B.48 —_ Voltage waveforms for logic gates. 
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the voltage level Vpp and then makes a transition to 0. Once V, reaches a sufficiently low 
voltage, N; begins to drive voltage V4 toward Vpp. Because of the parasitic capacitance, 
V4 cannot change instantaneously and a waveform with the shape indicated in the figure 
results. The time needed for V4 to change from low to high is called the rise time, t,, which 
is defined as the time elapsed from when Vj, is at 10 percent of Vpp until it reaches 90 
percent of Vpp. Figure B.48 also defines the total amount of time needed for the change at 
V, to cause a change in V4. This interval is called the propagation delay, often written 1,, 
of the inverter. It is the time from when V, reaches 50 percent of Vpp until V4 reaches the 
same level. 

After remaining at 0 V for some time, V, then changes back to Vpp, causing N; to 
discharge C to Gnd. In this case the transition time at node A pertains to a change from 
high to low, which is referred to as the fall time, tr, from 90 percent of Vpp to 10 percent 
of Vpp. As indicated in the figure, there is a corresponding propagation delay for the new 
change in V, to affect V4. In a given logic gate, the relative sizes of the PMOS and NMOS 
transistors are usually chosen such that f, and ty have about the same value. 

Equations B.1 and B.2 specify the amount of current flow through an NMOS transistor. 
Given the value of C in Figure B.47, it is possible to calculate the propagation delay for a 
change in V4 from high to low. For simplicity, assume that V, is initially 0 V; hence the 
PMOS transistor is turned on, and V4 = 5 V. Then V,. changes to Vpp at time 0, causing the 
PMOS transistor to turn off and the NMOS to turn on. The propagation delay is defined as 
the time required to discharge C through the NMOS transistor to the voltage Vpp/2. When 
V,, first changes to Vpp, V4 = 5 V; hence the NMOS transistor will have Vps = Vpp and 
will be in the saturation region. The current [p is given by Equation B.2. Once V, drops 
below Vpp — Vr, the NMOS transistor will enter the triode region where [p is given by 
Equation B.1. For our purposes, we can approximate the current flow as V4 changes from 
Vpp to Vpp/2 by finding the average of the values given by Equation B.2 with Vos = Vpp 
and Equation B.1 with Vps = Vpp/2. Using the basic expression for the time needed to 
charge a capacitor (see Example B.15), we have 

CAV CVpp/2 


— — 
7 Ip Ip 


Substituting for the average value of [p as discussed above, yields [1] 
p= [B.4] 


This expression specifies that the speed of the circuit depends both on the value of C and 
on the dimensions of the transistor. The delay can be reduced by making C smaller or by 
making the ratio W/L larger. The expression shows the propagation time when the output 
changes from a high level to a low level. The low-to-high propagation time is given by the 
same expression but using k, and W/L of the PMOS transistor. 

In logic circuits, Lis usually set to the minimum value that is permitted according to the 
specifications of the fabrication technology used. The value of W is chosen depending on 
the amount of current flow, hence propagation delay, that is desired. Figure B.49 illustrates 
two sizes of transistors. Part (a) depicts a minimum-size transistor, which would be used 
in a circuit wherever capacitive loading is small or where speed of operation is not critical. 
Figure B.49b shows a larger transistor, which has the same length as the transistor in part (a) 
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(a) Small transistor (b) Larger transistor 


Figure B.49 _ Transistor sizes. 


but a larger width. There is a trade-off involved in choosing transistor sizes, because a 
larger transistor takes more space on a chip than a smaller one. Also, increasing W not 
only increases the amount of current flow in the transistor but also results in an increase 
in the parasitic capacitance (recall that the capacitance C, between the gate terminal and 
ground is proportional to W x L), which tends to offset some of the expected improvement 
in performance. In logic circuits large transistors are used where high capacitive loads must 
be driven and where signal propagation delays must be minimized. 


Example B.7 


In the circuit in Figure B.47, assume that C = 70 fF and that W/L = 2.0 wm/0.5 jum. Also, 
ky = 60 pA/V? and Vpp = 5 V. Using Equation B.4, the high-to-low propagation delay of 
the inverter is 4, ~ 0.1 ns. 


B.8.6 POWER DISSIPATION IN LOGIC GATES 


In an electronic circuit it is important to consider the amount of electrical power consumed 
by the transistors. Integrated circuit technology allows fabrication of billions of transistors 
on a single chip; hence the amount of power used by an individual transistor must be small. 
Power dissipation is an important consideration in all applications of logic circuits, but it 
is crucial in situations that involve battery-operated equipment, such as portable computers 
and the like. 

Consider again the NMOS inverter in Figure B.45. When V, = 0, no current flows and 
hence no power is used. But when V, = 5 V, power is consumed because of the current 
Ista The power consumed in the steady state is given by Ps = IsiaxVpp. In Example B.5 
we calculated Js4, = 0.2 mA. The power consumed is then Ps = 0.2mA x 5V = 1.0mW. 
If we assume that a chip contains, say, the equivalent of 10,000 inverters, then the total 
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power consumption is 10 W! Because of this large power consumption, NMOS-style gates 
are used only in special-purpose applications, which we discuss in Section B.8.9. 

To distinguish between power consumed during steady-state conditions and power 
consumed when signals are changing, it is customary to define two types of power. Static 
power is dissipated by the current that flows in the steady state, and dynamic power is 
consumed when the current flows because of changes in signal levels. NMOS circuits 
consume static power as well as dynamic power, while CMOS circuits consume mostly 
dynamic power. Note that we mentioned in Section B.8.3 that a small amount of current, 
called leakage current, flows through transistors that are turned off. Although this leakage 
current causes static power dissipation even when using CMOS circuits, we do not consider 
this effect further in our discussion. 

Consider the CMOS inverter presented in Figure B.12a. When the input V, is low, no 
current flows because the NMOS transistor is off. When V, is high, the PMOS transistor is 
off and again no current flows. Hence no current flows ina CMOS circuit under steady-state 
conditions. Current does flow in CMOS circuits, however, for a short time when signals 
change from one voltage level to another. 

Figure B.50a depicts the following situation. Assume that V,. has been at 0 V for some 
time; hence Vy = 5 V. Now let V, change to 5 V. The NMOS transistor turns on, and it 
pulls Vy toward Gnd. Because of the parasitic capacitance C at node f, voltage V- does not 
change instantaneously, and current Jp flows through the NMOS transistor for a short time 
while the capacitor is being discharged. A similar situation occurs when V, changes from 
5 V to 0, as illustrated in Figure B.50b. Here the capacitor C initially has 0 volts across it 
and is then charged to 5 V by the PMOS transistor. Current flows from the power supply 
through the PMOS transistor while the capacitor is being charged. 

The voltage transfer characteristic for the CMOS inverter, shown in Figure B.46, in- 
dicates that a range of input voltage V,. exists for which both transistors in the inverter are 


w—[ | > 7 


(a) Current flow when input V, (b) Current flow when input V,, 
changes from 0 V to 5 V changes from 5 V to0 V 


Figure B.50 = Dynamic current flow in CMOS circuits. 
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turned on. Within this voltage range, specifically Vr < V, < (Vpp — Vr), current flows 
from Vpp to Gnd through both transistors. This current is often referred to as the short- 
circuit current in the gate. In comparison to the amount of current used to (dis)charge the 
capacitor C, the short-circuit current is negligible in most cases. 

The power used by a single CMOS inverter is extremely small. Consider again the 
situation in Figure B.50a when Vr = Vpp. The amount of energy stored in the capacitor is 
equal to C Vin /2 (see Example B.12). When the capacitor is discharged to 0 V, this stored 
energy is dissipated in the NMOS transistor. Similarly, for the situation in Figure B.50b, the 
energy C Vis /2 is dissipated in the PMOS transistor when C is charged up to Vpp. Thus for 
each cycle in which the inverter charges and discharges C, the amount of energy dissipated 
is equal to C Vans Since power is defined as energy used per unit time, the power dissipated 
in the inverter is the product of the energy used in one discharge/charge cycle times the 
number of such cycles per second, f. Hence the dynamic power consumed is 


Pp =fCVAp 


In practice, the total amount of dynamic power used in CMOS circuits is significantly lower 
than the total power needed in other technologies, such as NMOS. For this reason, virtually 
all large integrated circuits fabricated today are based on CMOS technology. 


Example B.8 


For a CMOS inverter, assume that C = 70 fF and f = 100 MHz. The dynamic power 
consumed by the gate is Pp = 175 wW. If we assume that a chip contains the equivalent of 
10,000 inverters and that, on average, 20 percent of the gates change values at any given time, 
then the total amount of dynamic power used in the chipis Pp = 0.2 x 10,000 x 175 uwW = 
0.35 W. 


B.8.7 PASSING 1S AND 0S THROUGH TRANSISTOR SWITCHES 


In Figure B.4 we showed that NMOS transistors are used as pull-down devices and PMOS 
transistors are used as pull-up devices. We now consider using the transistors in the opposite 
way, that is, using an NMOS transistor to drive an output high and using a PMOS transistor 
to drive an output low. 

Figure B.51a illustrates the case of an NMOS transistor for which both the gate terminal 
and one side of the switch are driven to Vpp. Let us assume initially that both Vg and node 
A are at 0 V, and we then change Vg to 5 V. Node A is the transistor’s source terminal 
because it has the lowest voltage. Since Ves = Vpp, the transistor is turned on and drives 
node A toward Vpp. When the voltage at node A rises, Vgs decreases until the point when 
Vas is no longer greater than Vr. At this point the transistor turns off. Thus in the steady 
state V4 = Vpp — Vr, which means that an NMOS transistor can only partially pass a high 
voltage signal. 

A similar situation occurs when a PMOS transistor is used to pass a low voltage level, 
as depicted in Figure B.51b. Here assume that initially both Vg and node B are at 5 V. Then 
we change Vg to 0 V so that the transistor turns on and drives the source node (node B) 
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(a) NMOS transistor (b) PMOS transistor 


Figure B.51) = NMOS and PMOS transistors used in the opposite way 
from Figure B.4. 


(a) An AND gate circuit (b) Truth table and voltage levels 


Figure B.52 —A poor implementation of a CMOS AND gate. 


toward 0 V. When node B is decreased to V7, the transistor turns off; hence the steady-state 
voltage is equal to Vr. 

In Section B.1 we said that for an NMOS transistor the substrate (body) terminal is 
connected to Gnd and for a PMOS transistor the substrate is connected to Vpp. The voltage 
between the source and substrate terminals, Vsg, which is called the substrate bias voltage, 
is normally equal to 0 V in a logic circuit. But in Figure B.51 both the NMOS and PMOS 
transistors have Vsg = Vpp. The bias voltage has the effect of increasing the threshold 
voltage in the transistor Vr by a factor of about 1.5 or higher [2, 1]. This issue is known as 
the body effect. 

Consider the logic gate shown in Figure B.52. In this circuit the Vpp and Gnd connec- 
tions are reversed from the way in which they were used in previously discussed circuits. 
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When both V,, and V,, are high, then V> is pulled up to the high output voltage, Voy = 
Vpp — 1.5Vr. If Vop = 5 V and V7 = 1 V, then Vow = 3.5 V. When either V,, or V;, is 
low, then Vy is pulled down to the low output voltage, Vo, = 1.5V;7, or about 1.5 V. As 
shown by the truth table in the figure, the circuit represents an AND gate. In comparison to 
the normal AND gate shown in Figure B.15, the circuit in Figure B.52 appears to be better 
because it requires fewer transistors. But a drawback of this circuit is that it offers a lower 
noise margin because of the poor levels of Voy and Vo;. 

Another important weakness of the circuit in Figure B.52 is that it causes static power 
dissipation, unlike a normal CMOS AND gate. Assume that the output of such an AND gate 
drives the input of a CMOS inverter. When V7 = 3.5 V, the NMOS transistor in the inverter 
is turned on and the inverter output has a low voltage level. But the PMOS transistor in 
the inverter is not turned off, because its gate-to-source voltage is —1.5 V, which is larger 
than V7. Static current flows from Vpp to Gnd through the inverter. A similar situation 
occurs when the AND gate produces the low output Vy = 1.5 V. Here the PMOS transistor 
in the inverter is turned on, but the NMOS transistor is not turned off. The AND gate 
implementation in Figure B.52 is not used in practice. 


B.8.8 TRANSMISSION GATES 


In Section B.8.7 we showed that an NMOS transistor passes 0 well and 1 poorly, while a 
PMOS transistor passes 1 well and 0 poorly. It is possible to combine an NMOS and a 
PMOS transistor into a single switch that is capable of driving its output terminal either 
to a low or high voltage equally well. Figure B.53a gives the circuit for a transmission 
gate. As indicated in parts (b) and (c) of the figure, it acts as a switch that connects x to f. 
Switch control is provided by the select input s and its complement s. The switch is turned 
on by setting V; = 5 V and V; = 0. When V, is 0, the NMOS transistor will be turned on 
(because Ves = V; — Vy = 5 V) and Vy will be 0. On the other hand, when V, is 5 V, then 
the PMOS transistor will be on (Ves = Vs — Vy = —5 V) and Vy will be 5 V. A graphical 
symbol for the transmission gate is given in Figure B.53d. 

Transmission gates can be used in a variety of applications. The examples below show 
how they can be used to implement multiplexers, XOR gates, and latch circuits. 


Example B.9 


We showed a circuit that uses transmission gates to implement a multiplexer in Figure 4.1d, 
which is reproduced in Figure B.54. The select input s is used to choose whether the output 
f should have the value of input w; or wo. If s = 0, then f = w,; if s = 1, then f = wo. 


Example B.10 


The XOR function in sum-of-products form is 
X1 DB XQ = XyX2 + X1X2 


which leads to a circuit with two NOT gates, 2 two-input AND gates, and a two-input 
OR gate. We know from Section B.3 that each AND and OR gate requires six transistors, 
while a NOT gate needs two transistors. Hence 22 transistors are required to implement 
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Figure B.53 —_A transmission gate. 


Figure B.54 ~— A 2-to-1 multiplexer built using transmission 
gates. 


this circuit in CMOS technology. It is possible to greatly reduce the number of transistors 
needed by making use of transmission gates. Figure B.55 gives a circuit for an XOR gate 
that uses the multiplexer structure shown in Figure B.54. The output f is set to the value 
of x. when x; = 0 by the top transmission gate. The bottom transmission gate sets f to 
X2 when x; = 1. This circuit needs only six transistors. We discuss the notion of using 
multiplexers to implement logic functions in Section 4.1. 
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Figure B.55 —_ Implementation of an XOR gate. 


Example B.11 Consider the circuit in Figure B.56, which includes two transmission gates. One trans- 


mission gate, TG1, is used to connect the Data input terminal to point A in the circuit. 
The second, TG2, is used as a switch in the feedback loop that maintains the state of the 
circuit. The transmission gates are controlled by the Load signal. If Load = 1, then TG1 
is on and the point A will have the same value as the Data input. Since the value presently 
stored at Output may not be the same value as Data, the feedback loop is broken by having 
TG2 turned off when Load = 1. When Load changes to 0, then TG1 turns off and TG2 
turns on. The feedback path is closed and the memory element will retain its state as long 
as Load = 0. This circuit implements a gated D latch. It is equivalent to the circuit with 
NAND gates shown in Figure 5.7, with the Load input in Figure B.56 acting as the C/k input. 


B.8.9 FAN-IN AND FAN-OUT IN LOGIC GATES 


The fan-in of a logic gate is defined as the number of inputs to the gate. Depending on 
how a logic gate is constructed, it may be impractical to increase the number of inputs 
beyond a small value. For example, consider the NMOS NAND gate in Figure B.57, which 
has k inputs. We wish to consider the effect of k on the propagation delay ¢, through the 
gate. Assume that all k NMOS transistors have the same width W and length L. Because 
the transistors are connected in series, we can consider them to be equivalent to one long 
transistor with length k x L and width W. Using Equation B.4 (which can be applied to 
both CMOS and NMOS gates), the propagation delay is given by 


a ATC 
C= 


P= Tw 
kh 7 Vop 


xk 


Here C is the equivalent capacitance at the output of the gate, including the parasitic 
capacitance contributed by each of the k transistors. The performance of the gate can be 
improved somewhat by increasing W for each NMOS transistor. But this change further 
increases C and comes at the expense of chip area. Another drawback of the circuit is that 
each NMOS transistor has the effect of increasing Voz, hence reducing the noise margin. It 
is practical to build NAND gates in this manner only if the fan-in is small. 
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Figure B.56 —_Using transmission gates to implement a gated D latch. 


ri 


Figure B.57 — High fan-in NUOS NAND gate. 
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Vpp 


Figure B.58 — High fan-in NWYOS NOR gate. 


As another example of fan-in, Figure B.58 shows an NMOS k-input NOR gate. In this 
case the k NMOS transistors connected in parallel can be viewed as one large transistor 
with width k x W and length L. According to Equation B.4, the propagation delay should 
be decreased by the factor k. However, the parallel-connected transistors increase the load 
capacitance C at the gate’s output and, more importantly, it is extremely unlikely that all of 
the transistors would be turned on when V; is changing from a high to low level. It is thus 
practical to build high fan-in NOR gates in NMOS technology. We should note, however, 
that in an NMOS gate the low-to-high propagation delay may be slower than the high-to- 
low delay as a result of the current-limiting effect of the pull-up device (see Examples B.13 
and B.14). 

High fan-in CMOS logic gates always require either k NMOS or k PMOS transistors 
in series and are therefore never practical. In CMOS the only reasonable way to construct 
a high fan-in gate is to use two or more lower fan-in gates. For example, one way to realize 
a six-input AND gate is as 2 three-input AND gates that connect to a two-input AND gate. 
It is possible to build a six-input CMOS AND gate using fewer transistors than needed with 
this approach, but we leave this as an exercise for the reader (see Problem B.4). 


Fan-out 

Figure B.48 illustrated timing delays for one NOT gate driving another. In real circuits 
each logic gate may be required to drive several others. The number of other gates that a 
specific gate drives is called its fan-out. An example of fan-out is depicted in Figure B.59a, 
which shows an inverter Nj; that drives the inputs of n other inverters. Each of the other 
inverters contributes to the total capacitive loading on node f. In part (b) of the figure, 
the n inverters are represented by one large capacitor C,,. For simplicity, assume that each 
inverter contributes a capacitance C and that C, =n x C. Equation B.4 shows that the 
propagation delay increases in direct proportion to 7. 

Figure B.59c illustrates how n affects the propagation delay. It assumes that a change 
from logic value | to 0 on signal x occurs at time 0. One curve represents the case where 
n = 1, and the other curve corresponds to n = 4. Using the parameters from Example B.7, 
when n = 1, we have ¢, = 0.1 ns. Then for n = 4, ¢, © 0.4 ns. It is possible to reduce 1, 
by increasing the W/L ratios of the transistors in Ny. 
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Figure B.59 The effect of fan-out on propagation delay. 


Drivers 


In circuits in which a logic gate has to drive a large capacitive load, drivers are often 
used to improve performance. A driver is a logic gate with one input, x, and one output, 
jf. which produces f = x. The simplest implementation of a driver uses two inverters, as 
shown in Figure B.60a. Drivers can be created with different amounts of drive capability, 
depending on the sizes of the transistors (see Figure B.49). In general, because they are 
used for driving higher-than-normal capacitive loads, drivers have transistors that are larger 
than those in typical logic gates. The graphical symbol for a noninverting driver is given 
in Figure B.60b. 
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Vpp 


(a) Implementation of a driver 


— 
(b) Graphical symbol 


Figure B.60 — A noninverting driver. 


Another type of driver is the inverting driver. It produces the same output as an inverter, 
f =x, but is built with relatively large transistors. The graphical symbol for the inverting 
driver is the same as for the NOT gate; an inverting driver is just a NOT gate that is capable 
of driving large capacitive loads. In Figure B.59 for large values of n an inverting driver 
could be used for the inverter labeled Nj. 

In addition to their use for improving the speed performance of circuits, drivers are 
also used when high current flow is needed to drive external devices. Drivers can handle 
relatively large amounts of current flow because they are built with large transistors. A 
common example of this use of drivers is to control a light-emitting diode (LED). We 
describe an example of this application of drivers in Section 5.14.1. 

In general, fan-out, capacitive loading, and current flow are important issues that the 
designer of a digital circuit must consider carefully. In practice, the decision as to whether 
or not drivers are needed in a circuit is made with the aid of CAD tools. 


B.8.10 TRI-STATE DRIVERS 


As we showed in Figure 7.1, a tri-state driver has one input, w, one output, f, and a control 
input, called enable, e. When e = 0, the driver is completely disconnected from the output 
jf, but when e = 1, the driver provides the value of w on the output f. Figure B.61 shows 
a possible implementation of a tri-state driver using NOT gates and a transmission gate. 


B.8 PRACTICAL ASPECTS 


" f a re j 
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Figure B.61 _Tri-state driver. 


w f w f 
(a) (b) 

Ww if w f 
(c) (d) 


Figure B.62 Four types of tri-state drivers. 


Figure B.62 depicts several types of tri-state drivers. The driver in part (b) has the 
same behavior as the driver in part (a), except that when e = 1, it produces f = w. Part (c) 
of the figure gives a tri-state driver for which the enable signal has the opposite behavior; 
that is, when e = 0, f = w, and when e = 1, f = Z. The term often used to describe this 
type of behavior is to say that the enable is active low. The driver in Figure B.62d also 
features an active-low enable, and it produces f = w when e = 0. 

As a small example of how tri-state drivers can be used, consider the circuit in Figure 
B.63. In this circuit the output f is equal to either w; or w2, depending on the value of s. 
When s = 0, f = wy, and when s = 1, f = w2. Thus, this circuit implements a 2-to-1 
multiplexer. 

In the circuit of Figure B.63, the outputs of the tri-state drivers are wired together. This 
connection is possible because the control input s is connected so that one of the two drivers 
is guaranteed to be in the high-impedance state. The w; driver is active only when s = 0, 
and the w2 driver is active only when s = 1. It would be disastrous to allow both drivers 
to be active at the same time. Doing so would create a short circuit between Vpp and Gnd 
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W 


Figure B.63 = An application of tri-state drivers. 


as soon as the two drivers produce different values. For example, assume that wy = | and 
Wz = 0. The w, driver produces the output Vpp, and the w2 driver produces Gnd. A short 
circuit is formed between Vpp and Gnd, through the transistors in the tri-state drivers. The 
amount of current that flows through such a short circuit is usually sufficient to destroy the 
circuit. 

The kind of wired connection used for the tri-state drivers is not possible with ordinary 
logic gates, because their outputs are always active; hence a short circuit would occur. As 
we already know, for normal logic circuits the equivalent result of the wired connection is 
achieved by using an OR gate to combine signals, as is done in the sum-of-products form. 


B.9  Sratic RANDOM AccEss MEMoryY (SRAM) 


In Chapter 5 we introduced several types of circuits that can be used to store data. Assume 
that we need to store a large number, m, of data items, each of which consists of 7 bits. One 
possibility is to use an n-bit register that consists of n D flip-flops for each data item. We 
would need to design circuitry to control access to each register, both for loading (writing) 
data into it and for reading data out. 

When mm is large, it is awkward to use individual registers to store the data. A better 
approach is to make use of a static random access memory (SRAM) block. An SRAM block 
is a two-dimensional array of SRAM cells, where each cell can store one bit of information. 
If we need to store m items with n bits each, we can use an array of m x n SRAM cells. 
The dimensions of the SRAM array are called its aspect ratio. 

An SRAM block may contain a large number of SRAM cells, so it is important for 
each cell to take as little space on an integrated circuit chip as possible. For this reason, 
the storage cell should use as few transistors as possible. One popular storage cell used in 
practice is depicted in Figure B.64. It operates as follows. To store data into the cell, the 
Sel input is set to 1, and the data value to be stored is placed on the Data input. The SRAM 
cell may include a separate input for the complement of the data, indicated by the transistor 
shown in blue in the figure. For simplicity we assume that this transistor is not included 
in the cell. After waiting long enough for the data to propagate through the feedback path 
formed by the two NOT gates, Se/ is changed to 0. The stored data then remains in the 
feedback loop indefinitely. A possible problem is that when Se/ = 1, the value of Data 


B.9 STATIC RANDOM AccEss MEMory (SRAM) 


Data Data 


Figure B.64 = An SRAM cell. 
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Figure B.65 =A 2 x 2 array of SRAM cells. 


may not be the same as the value being driven by the small NOT gate in the feedback path. 
Hence the transistor controlled by Sel may attempt to drive the stored data to one logic 
value while the output of the small NOT gate has the opposite logic value. To resolve this 
problem, the NOT gate in the feedback path is built using small (weak) transistors, so that 
its output can be overridden with new data. 

To read data stored in the cell, we simply set Se/ to 1. In this case the Data node would 
not be driven to any value by external circuitry, so that the SRAM cell can place the stored 
data on this node. The Data signal is passed through a driver, not shown in the figure, and 
provided as an output of the SRAM block. 

An SRAM block contains an array of SRAM cells. Figure B.65 shows an array with 
two rows of two cells each. In each column of the array, the Data nodes of the cells are 
connected together. Each row, i, has a separate select input, Sel;, that is used to read or 
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write the contents of the cells in that row. Larger arrays are formed by connecting more 
cells to Sel; in each row and by adding more rows. The SRAM block must also contain 
circuitry that controls access to each row in the array. Figure B.66 depicts a 2” x n array 
of the type in Figure B.65, with a decoder that drives the Sel inputs in each row of the array. 
The inputs to the decoder are called Address inputs. This term derives from the notion 
that the location of a row in the array can be thought of as the “address” of the row. The 
decoder has m Address inputs and produces 2” select outputs. If the Write control input is 1, 
then the tri-state drivers at the top of the figure are enabled and the data bits on the inputs 
dy—1,.-., do are stored in the cells of the row selected by the Address inputs. If the Read 
control input is 1, then the tri-state drivers at the bottom of Figure B.66 are enabled and the 
data stored in the row selected by the Address inputs appears on the gn_1, ..., go outputs. In 


Data inputs { d,_1 4y_9 dy 


Sel, 


Address 


Data outputs { In-1  In-2 i) 


Figure B.66 = A 2” x n SRAM block. 
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many practical applications the data inputs and data outputs are connected together. Thus 
the Write and Read inputs must never have the value | at the same time. 

The design of memory blocks has been the subject of intensive research and develop- 
ment. We have described only the basic operation of one type of memory block. The reader 
can refer to books on computer organization for more information [14, 15]. 


B.9.1 SRAM BLOocks IN PLDs 


Most FPGA chips contain SRAM blocks that can be used as part of logic circuits imple- 
mented in the chips. One popular chip has a number of SRAM blocks, each of which 
contains 4096 SRAM cells. The SRAM blocks can be configured to provide different as- 
pect ratios, depending on the needs of the design being implemented. Aspect ratios from 
512 x 8 to 4096 x 1 can be realized using a single SRAM block, and multiple blocks can 
be combined to form larger memory arrays. To include SRAM blocks in a circuit, designers 
use prebuilt modules that are provided in a library as part of the CAD tools, or they write 
Verilog code from which synthesis tools can infer memory blocks. 
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We introduced PLDs in Section B.6. In the chip diagrams shown in that section, the 
programmable switches are represented using the symbol X. We now show how these 
switches are implemented using transistors. 

In commercial SPLDs two main technologies are used to manufacture the programmable 
switches. The oldest technology is based on using metal-alloy fuses as programmable links. 
In this technology each pair of horizontal and vertical wires that cross is connected by a 
small metal fuse. When the chip is programmed, for every connection that is not wanted 
in the circuit being implemented, the associated fuse is melted. The programming process 
is not reversible, because the melted fuses are destroyed. We will not elaborate on this 
technology, because it has mostly been replaced by another method. 

In currently produced PLAs and PALs, programmable switches are implemented using 
a special type of programmable transistor. Because CPLDs comprise PAL-like blocks, the 
technology used in SPLDs is also applicable to CPLDs. We will illustrate the main ideas 
by first describing PLAs. For a PLA to be useful for implementing a wide range of logic 
functions, it should support both functions of only a few variables and functions of many 
variables. In Section B.8.9 we discussed the issue of fan-in of logic gates. We showed that 
when the fan-in is high, the best type of gate to use is the NMOS NOR gate. Hence PLAs 
are usually based on this type of gate. 

As a small example of PLA implementation, consider the circuit in Figure B.67. The 
horizontal wire labeled S; is the output of an NMOS NOR gate with the inputs x2 and 
x3. Thus Sj = x2 +3. Similarly, Sy and $3 are the outputs of NOR gates that produce 
Sy = x; + x3 and $3 = x1 +X. +23. The three NOR gates that produce S$), S>, and $3 
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Figure B.67 ~— An example of a NOR-NOR PLA. 


are arranged in a regular structure that is efficient to create on an integrated circuit. This 
structure is called a NOR plane. The NOR plane is extended to larger sizes by adding 
columns for additional inputs and adding rows for more NOR gates. 

The signals $,, Sy, and S3 serve as inputs to a second NOR plane. This NOR plane is 
turned 90 degrees clockwise with respect to the first NOR plane to make the diagram easier 
to draw. The NOR gate that produces the output f; has the inputs S; and Sy. Thus 


fi = Si + So = (%2 + 3) + (41 + x3) 
Using DeMorgan’s theorem, this expression is equivalent to the product-of-sums expression 
fi = S182 = (%2 +3) (a1 + x3) 


Similarly, the NOR gate with output fo has inputs S$; and S3. Therefore, 


to = S| + S3 = (Xp +3) + (41 + X2 + x3) 
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which is equivalent to 
fy = S183 = (2 +3) (41 +2 +43) 


The style of PLA illustrated in Figure B.67 is called a NOR-NOR PLA. Alternative 
implementations also exist, but because of its simplicity, the NOR-NOR style is the most 
popular choice. The reader should note that the PLA in Figure B.67 is not programmable— 
with the transistors connected as shown, it realizes only the two specific logic functions f; 
and f,. But the NOR-NOR structure can be used in a programmable version of the PLA, as 
explained below. 

Strictly speaking, the term PLA should be used only for the fixed type of PLA de- 
picted in Figure B.67. The proper technical term for a programmable type of PLA is 
field-programmable logic array (FPLA). However, it is common usage to omit the F’. Fig- 
ure B.68a shows a programmable version of a NOR plane. It has n inputs, x;,..., X,, 
and k outputs, S;,...,.S,. At each crossing point of a horizontal and vertical wire there 
exists a programmable switch. The switch comprises two transistors connected in series, an 
NMOS transistor and an electrically erasable programmable read-only memory (EEPROM) 
transistor. 

The programmable switch is based on the behavior of the EEPROM transistor. Elec- 
tronics textbooks, such as [1, 2], give detailed explanations of how EEPROM transistors 
operate. Here we will provide only a brief description. A programmable switch is depicted 
in Figure B.685, and the structure of the EEPROM transistor is given in Figure B.68c. The 
EEPROM transistor has the same general appearance as the NMOS transistor (see Figure 
B.43) with one major difference. The EEPROM transistor has two gates: the normal gate 
that an NMOS transistor has and a second floating gate. The floating gate is so named be- 
cause it is surrounded by insulating glass and is not connected to any part of the transistor. 
When the transistor is in the original unprogrammed state, the floating gate has no effect 
on the transistor’s operation and it works as a normal NMOS transistor. During normal use 
of the PLA, the voltage on the floating gate V, is set to Vpp by circuitry not shown in the 
figure, and the EEPROM transistor is turned on. 

Programming of the EEPROM transistor is accomplished by turning on the transistor 
with a higher-than-normal voltage level (typically, V. = 12 V), which causes a large amount 
of current to flow through the transistor’s channel. Figure B.68c shows that a part of the 
floating gate extends downward so that it is very close to the top surface of the channel. 
A high current flowing through the channel causes an effect, known as Fowler-Nordheim 
tunneling, in which some of the electrons in the channel “tunnel” through the insulating 
glass at its thinnest point and become trapped under the floating gate. After the programming 
process is completed, the trapped electrons repel other electrons from entering the channel. 
When the voltage V. = Vpp is applied to the EEPROM transistor, which would normally 
cause it to turn on, the trapped electrons keep the transistor turned off. Hence in the NOR 
plane in Figure B.68a, programming is used to “disconnect” inputs from the NOR gates. 
For the inputs that should be connected to each NOR gate, the corresponding EEPROM 
transistors are left in the unprogrammed state. 

Once an EEPROM transistor is programmed, it retains the programmed state perma- 
nently. However, the programming process can be reversed. This step is called erasing, and 
it is done by using voltages that are of the opposite polarity to those used for programming. 
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(b) A programmable switch (c) EEPROM transistor 


Figure B.68 — Using EEPROM transistors to create a programmable NOR plane. 
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In this case, the applied voltage causes the electrons that are trapped under the floating gate 
to tunnel back to the channel. The EEPROM transistor returns to its original state and again 
acts like a normal NMOS transistor. 

For completeness, we should also mention another technology that is similar to EEP- 
ROM, called erasable PROM (EPROM). This type of transistor, which was actually created 
as the predecessor of EEPROM, is programmed in a similar fashion to EEPROM. However, 
erasing is done differently: to erase an EPROM transistor, it must be exposed to light energy 
of specific wavelengths. To facilitate this process, chips based on EPROM technology are 
housed in packages with a clear glass window through which the chip is visible. To erase 
a chip, it is placed under an ultraviolet light source for several minutes. Because erasure 
of EPROM transistors is more awkward than the electrical process used to erase EEPROM 
transistors, EPROM technology has essentially been replaced by EEPROM technology in 
practice. 

A complete NOR-NOR PLA using EEPROM technology, with four inputs, six sum 
terms in the first NOR plane, and two outputs, is depicted in Figure B.69. Each pro- 
grammable switch that is programmed to the off state is shown as X in black, and each 
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Figure B.69 — Programmable version of the NOR-NOR PLA. 
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switch that is left unprogrammed is shown in blue. With the programming states shown in 
the figure, the PLA realizes the logic functions fj = (x1 + x3)(%1 + X2)(%1 + x2 + 3) and 
fo = 1 + 3) (41 + x2) 1 + X2). 

Rather than implementing logic functions in product-of-sums form, a PLA can also 
be used to realize the sum-of-products form. For sum-of-products we need to implement 
AND gates in the first NOR plane of the PLA. If we first complement the inputs to the 
NOR plane, then according to DeMorgan’s theorem, this is equivalent to creating an AND 
plane. We can generate the complements at no cost in the PLA because each input is already 
provided in both true and complemented forms. An example that illustrates implementation 
of the sum-of-products form is given in Figure B.70. The outputs from the first NOR 
plane are labeled P;,..., P¢ to reflect our interpretation of them as product terms. The 
signal P, is programmed to realize x, + X2 = x, x2. Similarly, P2 = x1x3, P3 = X1X2x3, and 
P4 =X X2x3. Having generated the desired product terms, we now need to OR them. This 
operation can be accomplished by complementing the outputs of the second NOR plane. 


NOR plane 


NOR plane 


Figure B.7O0 = A NOR-NOR PIA used for sum-of-products. 
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Figure B.70 includes NOT gates for this purpose. The states indicated for the programmable 
switches in the OR plane (the second NOR plane) in the figure yield the following outputs: 
Fi = Pi + Po + P3 = xx. +13 + X1X0x3, and fo = Py) + Pa = xyxX2 + X1X2%3. 

The concepts described above for PLAs can also be used in PALs. Figure B.71 shows a 
PAL with four inputs and two outputs. Let us assume that the first NOR plane is programmed 
to realize product terms in the manner described above. Notice in the figure that the product 
terms are hardwired in groups of three to OR gates that produce the outputs of the PAL. 
As we illustrated in Figure B.29, the PAL may also contain extra circuitry between the OR 
gates and the output pins, which is not shown in Figure B.71. The PAL is programmed 
to realize the same logic functions, f; and jf, that were generated in the PLA in Figure 
B.70. Observe that the product term x;x2 is implemented twice in the PAL, on both P; and 
P,. Duplication is necessary because in a PAL product terms cannot be shared by multiple 
outputs, as they can be in a PLA. Another detail to observe in Figure B.71 is that although 
the function f2 requires only two product terms, each OR gate is hardwired to three product 
terms. The extra product term Ps must be set to logic value 0, so that it has no effect. This 
is accomplished by programming P¢ so that it produces the product of an input and that 
input’s complement, which always results in 0. In the figure, Pg = x,x,; = 0, but any other 
input could also be used for this purpose. 

The PAL-like blocks contained in CPLDs are usually implemented using the techniques 
discussed in this section. In a typical CPLD, the AND plane is built using NMOS NOR 
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Figure B.71_ ~~ PAL programmed to implement the functions in Figure B.70. 
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gates, with appropriate complementing of the inputs. The OR plane is hardwired as it is in 
a PAL, rather than being fully programmable as in a PLA. However, some flexibility exists 
in the number of product terms that feed each OR gate. This flexibility is accomplished by 
using a programmable circuit that can allocate the product terms to whichever OR gates the 
user desires. 


Example B.12 


In Section B.9 we described static random access memory. This type of memory is useful 
when it is necessary to support both reading and writing of data. In many situations it is 
also useful to be able to write data into a memory and then use this stored data permanently 
without further modification. A read-only memory (ROM) is suitable for such applications. 
In a ROM the storage cells are built using EEPROM (or EPROM) transistors, arranged in a 
structure similar to that shown in Figure B.68. Each data bit in the ROM normally provides 
a | if its corresponding EEPROM cell is programmed to be “off”, but this data bit is pulled 
down to 0 if the EEPROM cell is programmed to be “ON”. 

Figure B.72 shows an example of a ROM block. The storage cells are arranged in 2” 
rows with n cells per row. Thus each row stores n bits of information. The location of each 
row in the ROM is identified by its address. In the figure the row at the top has address 0, 
and the row at the bottom has address 2” — 1. The information stored in the rows can be 
accessed by asserting the select lines, Selo to Selg»_;. As shown in the figure, a decoder 
with m inputs and 2” outputs is used to generate the signals on the select lines. Figure B.72 
shows that each data line has an associated tri-state driver that is enabled by the ROM input 
named Read. To access, or read, data from the ROM, the address of the desired row is 
placed on the address lines and Read is set to 1. 


B.10.1 IMPLEMENTATION IN FPGAS 


FPGAs do not use EEPROM technology to implement the programmable switches. Instead, 
the programming information is stored in SRAM cells. An SRAM cell is used for each 
truth-table value stored ina LUT. SRAM cells are also used to configure the interconnection 
wires in an FPGA. 

Figure B.73 depicts a small section of the FPGA from Figure B.39. The logic element 
shown produces the output f,, which is driven onto the horizontal wire drawn in blue. This 
wire can be connected to some of the vertical wires that it crosses, using programmable 
switches. Each switch is implemented using an NMOS transistor, with its gate terminal 
controlled by an SRAM cell. Such a switch is known as a pass-transistor switch. If a 0 is 
stored in an SRAM cell, then the associated NMOS transistor is turned off. But if a 1 is 
stored in the SRAM cell, as shown for the switch drawn in blue, then the NMOS transistor 
is turned on. This switch forms a connection between the two wires attached to its source 
and drain terminals. The number of switches that are provided in the FPGA depends on the 
specific chip architecture. 

In Section B.8.7 we showed that an NMOS transistor can only partially pass a high 
logic value. Hence in Figure B.73 if V» is a high voltage level, then Vy is only partially 
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Figure B.72 =A 2” x n read-only memory (ROM) block. 
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Figure B.73 —_—Pass-transistor switches in FPGAs. 
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Figure B.74 _ Restoring a high voltage level. 


high. Using the values from Section B.8.7, if V;, = 5 V, then V4 = 3.5 V. As we explained 
in Section B.8.7, this degraded voltage level has the result of causing static power to be 
consumed (see Example B.19). One solution to this problem [1] is illustrated in Figure B.74. 
We assume that the signal V4 passes through another pass-transistor switch before reaching 
its destination at another logic element. The signal Vg has the same value as V4 because the 
threshold voltage drop occurs only when passing through the first pass-transistor switch. 
To restore the level of Vg, it is driven with an inverter. A PMOS transistor is connected 
between the input of this inverter and Vpp, and that transistor is controlled by the inverter’s 
output. The PMOS transistor has no effect on the inverter’s output voltage level when 
Vz = 0 V. But when Vz = 3.5 V, then the inverter output is low, which turns on the PMOS 
transistor. This transistor quickly restores Vg to the proper level of Vpp, thus preventing 
current from flowing in the steady state. Instead of using this pull-up transistor solution, 
another possible approach is to alter the threshold voltage of the PMOS transistor (during 
the integrated circuit manufacturing process) in the inverter in Figure B.74, such that the 
magnitude of its threshold voltage is large enough to keep the transistor turned off when 
Vp = 3.5 V. In commercial FPGAs both of these solutions are used in different chips. 

An alternative to using a single NMOS transistor as a switch is to use a transmission gate, 
described in Section B.8.8. While this approach solves the voltage-level problem, it has 
two drawbacks. First, having both an NMOS and PMOS transistor in the switch increases 
the capacitive loading on the interconnection wires, which increases the propagation delays 
and power consumption. Second, the transmission gate takes more chip area than does a 
single NMOS transistor. For these reasons, commercial FPGA chips do not currently use 
transmission-gate switches. 


B.11 CONCLUDING REMARKS 


We have described the most important concepts that are needed to understand how logic 
gates are built using transistors. Our discussions of transistor fabrication, voltage levels, 
propagation delays, power dissipation, and the like are meant to give the reader an appre- 
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ciation of the practical issues that have to be considered when designing and using logic 
circuits. 

We have introduced several types of integrated circuit chips. Each type of chip is 
appropriate for different applications. The standard chips, such as the 7400 series, contain 
only a few simple gates and are rarely used today. Exceptions to this are the driver chips, 
which are employed when large amounts of current are needed. The various types of PLDs 
are widely used in many types of applications. Simple PLDs, like PLAs and PALs, are 
appropriate for implementation of small logic circuits. The SPLDs offer low cost and high 
speed. CPLDs can be used for the same applications as SPLDs, but CPLDs are also well 
suited for implementation of larger circuits, up to about 10,000 to 20,000 gates. Many of 
the applications that can be targeted to CPLDs can alternatively be realized with FPGAs. 
Which of these two types of chips are used in a specific design situation depends on many 
factors. Following the trend of putting as much circuitry as possible into a single chip, 
FPGAs are more widely used than other PLDs. Most digital designs created in the industry 
today contain some type of FPGA. 

The gate-array, standard-cell, and custom-chip technologies are used in cases where 
PLDs are not appropriate. Typical applications are those that entail very large circuits, 
require extremely high speed-of-operation, need low power consumption, and where the 
designed product is expected to sell in large volume. 
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This section presents some typical problems that the reader may encounter, and shows how 
such problems can be solved. 


Problem: We introduced standard cell technology in Section B.7. In this technology, 
circuits are built by interconnecting building-block cells that implement simple functions, 
like basic logic gates. A commonly used type of standard cell are the and-or-invert (AOT) 
cells, which can be efficiently built as CMOS complex gates. Consider the AOI cell shown 
in Figure B.75. This cell implements the function f = x)x2 + x3x4 + x5. Derive the CMOS 
complex gate that implements this cell. 


Figure B.75 = The AO! cell for Example B.13. 


Example B.13 
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Example B.14 
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Solution: Applying Demorgan’s theorem in two steps gives 


Sf = 1x2 + X3X4 + XS 


= (X1 +.X2) + (3 + X4) - X5 


Since all input variables are complemented in this expression, we can directly derive 
the pull-up network as having parallel-connected PMOS transistors controlled by x; and 
X2, in series with parallel-connected transistors controlled by x3 and x4, in series with a 
transistor controlled by x5. This circuit, along with the corresponding pull-down network, 
is shown in Figure B.76. 


Problem: For the CMOS complex gate in Figure B.76, determine the sizes of transistors 
that should be used such that the speed performance of this gate is similar to that of an 
inverter. 


Solution: Recall from Section B.8.5 that a transistor with length L and width W has a drive 
strength proportional to the ratio W/L. Also recall that when transistors are connected in 


Vpp 


Figure B.76 — Circuit for Examples B.13 and B.14. 
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parallel their widths are effectively added, leading to an increase in drive strength. Similarly, 
when transistors are connected in series, their lengths are added, leading to a decrease in 
drive strength. Let us assume that all NMOS and PMOS transistors have the same length, 
L, = Ly = L. In Figure B.76 the NMOS transistor connected to input V,, can have the 
same width as in an inverter, W,,. But the worst-case path through the pull-down network 
in this circuit involves two NMOS transistors in series. For these NMOS transistors, which 
are connected to inputs V,,,..., V,,, we should make the widths equal to 2 x W,,. For the 
pull-up network, the worst-case path involves three transistors in series. Since, as we said 
in Section B.8.1, PMOS transistors have about half the drive strength of NMOS transistors, 
we should make the effective width of the PMOS transistors 


W, =3x W, x 2=6W, 
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Problem: In Section B.8.5, we said that the time needed to charge a capacitor is given by 
CAV 


th = T 


Derive this expression. 


Solution: As we stated in Section B.8.5, the voltage across a capacitor cannot change 
instantaneously. In Figure B.50b, as V; is charged from 0 volts toward Vpp, the voltage 
changes according to the equation 


1 CO 
Vp = rao 
0 


In this expression, the independent variable ¢ is time, and i(t) represents the instantaneous 
current flow through the capacitor at time t. Differentiating both sides of this expression 
with respect to time, and rearranging gives 


dVy 
i(t) = ae 
For the case where / is constant, we have 
I AV 
Cc Ar 
Therefore, 
At=t, = a 


Example B.15 


Problem: In our discussion of Figure B.50a, in Section B.8.6, we said that a capacitor, Example B.16 


C, that has been charged to the voltage Vy = Vpp, stores an amount of energy equal to 
CV is /2. Derive this expression. 
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Solution: As shown in Example B.15, the current flow through a charging capacitor, C, is 
related to the rate of change of voltage across the capacitor, according to 
dV 
i(t) = C—— 
o dt 
The instantaneous power dissipated in the capacitor is 
P=i(t) x Vy 


Since energy is defined as the power used over a time period, we can calculate the energy, 
Ec, stored in the capacitor as Vy changes from 0 to Vpp by integrating the instantaneous 
power over time, as follows 


(oe) 


Bo = f in vpat 
0 


Substituting the above expression for i(t) gives 


dV 
f 
Ec = | C——V,rdt 
. / dt f 
0 


Vop 
= c| Ve dV 
0 


Lig 
= 3 CV pp 


Example B.17 


Problem: In the original NMOS technology, the pull-up device was an n-channel MOSFET. 
But most integrated circuits fabricated today use CMOS technology. Hence it is convenient 
to implement the pull-up resistor using a PMOS transistor, as shown in Figure B.77. Such 
a circuit is referred to as a pseudo-NMOS circuit. The pull-up device is called a “weak” 
PMOS transistor because it has a small W/L ratio. 

When V, = Vpp, Vy has a low value. The NMOS transistor is operating in the triode 
region, while the PMOS transistor limits the current flow because it is operating in the 
saturation region. The current through the NMOS and PMOS transistors has to be equal 
and is given by Equations B.1 and B.2. Show that the low-output voltage, Vy = Vo, is 
given by 


kp 
Kp 


Vp = (Vpp — Vr) | 1 1 


where k, and k,, called the gain factors, depend on the sizes of the PMOS and NMOS 
transistors, respectively. They are defined by k, = k, W,/Lp and k, = ki W,,/Ln. 
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Vop 


a 
“| 


Figure B.77 — The pseudo-NMOS inverter. 


Solution: For simplicity we will assume that the magnitude of the threshold voltages for 
both the NMOS and PMOS transistors are equal, so that 


Vr = Vryn = —Vrp 


The PMOS transistor is operating in the saturation region, so the current flowing through it 
is given by 


1 / 
Ipn= a> aa Vp — Vrp)” 


= sto(—Voo —Vrp) 
I 2 
= 5 (Vpp — Vr) 


Similarly, the NMOS transistor is operating in the triode region, and its current flow is 
defined by 


W, 1 
Ip =k, — | (Vi — Vrw) V7 — = VP 
Dak \ tN )Ve 5 7] 
ls 
=k, Wa Men) te 25 Ve 


1 
=F, | Won —Vr)Vy — | 


Since there is only one path for current to flow, we can equate the currents flowing through 
the NMOS and PMOS transistors and solve for the voltage V;. 


2 
ky(Vpp — Vr)? — 2kn(Vop — Vr) Vy + kVp = 0 


1 
ky(Vpp — Vr)° = 2kn | Won —Vr)Vp — = A 


811 


812 APPENDIX B~— e IMPLEMENTATION TECHNOLOGY 


This quadratic equation can be solved using the standard formula, with the parameters 


a=k,, b= —2kn(Vpp — Vr), ¢ = kp(Vpp — Vr)” 


yn we |B c 
f 9a V 4a a 


k 
(Vpp — Vr) = | 0 — Vr) - z_ Von — Vr)? 


which gives 


kp 
= (Vpp — Vr) | 1= i 


n 


Only one of these two solutions is valid, because we started with the assumption that the 
NMOS transistor is in the triode region while the PMOS is in the saturation region. Thus 


Ve = (Vpp — Vr) | 1 1— 


Example B.18 Problem: For the circuit in Figure B.77, assume the values k/ = 60 A/V°, k, =0.4k,, 
W,,/Ly = 2.0 wm/0.5 wm, W,/L, = 0.5 wm/0.5 wm, Vpp = 5 V, and Vr = 1 V. When 
V, = Vpp, calculate the following: 
(a) The static current, Isjar. 
(b) The on-resistance of the NMOS transistor. 
(c) Vor. 
(d) The static power dissipated in the inverter. 
(e) The on-resistance of the PMOS transistor. 
(f) Assume that the inverter is used to drive a capacitive load of 70 fF. Using Equation B.4, 
calculate the low-to-high and high-to-low propagation delays. 


Solution: (a) The PMOS transistor is saturated, therefore 


1 Wp 2 
Istat = 2? y, =r) 


LA 2 
= lL x lx (GV-1V)° = 192 vA 
(b) Using Equation B.3, 
, Wr 
Rps = 1/ Ee (Ves — vn| 


mA 
=1/ 0.060 x4xSbV—- 1v)| = 1.04kQ 
V 
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(c) Using the expression derived in Example B.17 we have 


W, A 
gow 2a 5 
iy Vv 
W, A 
ky, =k" = 240 8 
L, Vv 


24 
a =(5V—-1V)}/1—,/1- —~ 
Vor = Ve = (5 | Ea 


=0.21V 
(d) 
Pp = Ista X Vop 
= 192 vAx 5V =960 n.W ~ 1 mW 
(e) 


Rspp = Vsp/Isp 
= (Vpp — Vp) /Tstat 
= (5 V — 0.21 V)/0.192 mA = 24.9 kQ 


(f) The low-to-high propagation delay is 


1.7C 
PLH 7, Wo 
P Ly 


_ 17x 70 fF 
(44K Ix5V 


Vpp 


= 0.99 ns 


The high-to-low propagation delay is 


, __LIC 
PHL _ ki Me Vip 
1.7 x 70 fF 


= —— = 0). | ns 
604 x 4x 5V 


Problem: In Figure B.74 we showed a solution to the static power dissipation problem Example B.19 
when NMOS pass transistors are used. Assume that the PMOS pull-up transistor is re- 

moved from this circuit. Assume the parameters k/, = 60 A/ Vv’, ky, =0.5 x ki, W/L, = 

2.0 wm/0.5 wm, W,/L, = 4.0 wm/0.5 wm, Vpp = 5 V, and Vr = 1 V. For Vg = 3.5 V, 

calculate the following: 
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(a) The static current I¢jat- 

(b) The voltage V; at the output of the inverter. 

(c) The static power dissipation in the inverter. 

(d) If a chip contains 250,000 inverters used in this manner, find the total static power 
dissipation. 


Solution: (a) If we assume that the PMOS transistor is operating in the saturation region, 
then the current flow through the inverter is defined by 


Tear = eit is — Vy) 
stat — Pp 
27 i 


HA a 
eg eS Ye eNa Ty) = 30 uA 


(b) Since the static current, sq, flowing through the PMOS transistor also flows through 
the NMOS transistor, then assuming that the NMOS transistor is operating in the triode 
region, we have 


! W,, 1 2 
Tstat = nT | Wes saad Vin) Vos a 5¥2s| 


pA 1. 
rh BOVINE in Ve 
1 = 20Vy — 4VP 


Solving this quadratic equation yields Vy = 0.05 V. Note that the output voltage V; satisfies 
the assumption that the PMOS transistor is operating in the saturation region while the 
NMOS transistor is operating in the triode region. 

(c) The static power dissipated in the inverter is 


Ps = I stat x Vpp = 30 vA x 5V= 150 wW 
(d) The static power dissipated by 250,000 inverters is 


250,000 x Ps = 37.5 W 


PROBLEMS 


Answers to problems marked by an asterisk are given at the back of the book. 


Consider the circuit shown in Figure PB.1. 

(a) Show the truth table for the logic function f. 

(b) If each gate in the circuit is implemented as a CMOS gate, how many transistors are 
needed? 
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Figure PB.1_ =A sum-of-products CMOS circuit. 


B.2 (a) Show that the circuit in Figure PB.2 is functionally equivalent to the circuit in Fig- 
ure PB.1. 
(b) How many transistors are needed to build this CMOS circuit? 


x) 
a) 


Figure PB.2) = A CMOS circuit built with multiplexers. 


B.3 (a) Show that the circuit in Figure PB.3 is functionally equivalent to the circuit in Fig- 
ure PB.2. 
(b) How many transistors are needed to build this CMOS circuit if each XOR gate is 
implemented using the circuit in Figure B.55? 


xy A 
x3 


Figure PB.3 Circuit for Problem B.3. 
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In Section B.8.9 we said that a six-input CMOS AND gate can be constructed using two 
three-input AND gates and a two-input AND gate. This approach requires 22 transistors. 
Show how you can use only CMOS NAND and NOR gates to build the six-input AND 
gate, and calculate the number of transistors needed. (Hint: Use DeMorgan’s theorem.) 


Repeat Problem B.4 for an eight-input CMOS OR gate. 


(a) Give the truth table for the CMOS circuit in Figure PB.4. 

(b) Derive a canonical sum-of-products expression for the truth table from part (a). How 
many transistors are needed to build a circuit representing the canonical form if only AND, 
OR, and NOT gates are used? 


Vop 


Figure PB.4 =A three-input CMOS circuit. 


(a) Give the truth table for the CMOS circuit in Figure PB.5. 

(b) Derive the simplest sum-of-products expression for the truth table in part (a). How 
many transistors are needed to build the sum-of-products circuit using CMOS AND, OR, 
and NOT gates? 

Figure PB.6 shows half of a CMOS circuit. Derive the other half that contains the PMOS 
transistors. 

Figure PB.7 shows half of a CMOS circuit. Derive the other half that contains the NMOS 
transistors. 


PROBLEMS 


Vpp 


Figure PB.5 —A four-input CMOS circuit. 


Figure PB.6 ~The PDN in a CMOS circuit. 
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Vpp 
V,, —d 
V,, 4 
V, 
f 
1 
e 
Ve : 


Figure PB.7 = The PUN in a CMOS circuit. 


B.10 Derive a CMOS complex gate for the logic function f (x1, x2, x3, x4) = )- m(0, 1, 2, 4, 5, 
6, 8, 9, 10). 


B.11 Derive a CMOS complex gate for the logic function f (x), x2, x3, x4) = > m(0, 1, 2, 4, 6, 
8, 10, 12, 14). 


*B.12 Derive a CMOS complex gate for the logic function f = xy + xz. Use as few transistors as 
possible (Hint: Consider f). 


B.13 Derive a CMOS complex gate for the logic function f = xy + xz + yz. Use as few transis- 
tors as possible (Hint: Consider f). 


*B.14 For an NMOS transistor, assume that k’ = 20 wA/V*, W/L = 2.5 ym/0.5 um, Ves 
5 V, and Vr = 1 V. Calculate 
(a) Ip when Vps = 5 V 
(b) Ip when Vos =0.2V 


B.15 For a PMOS transistor, assume that ky = 10 pA/V?, W/L = 2.5 wm/0.5 um, Ves 
—5 V, and V; = —1 V. Calculate 
(a) Ip when Vps = —5 V 
(b) Ip when Vos = —0.2V 


B.16 For an NMOS transistor, assume that k/ = 20 wA/V?, W/L = 5.0 wm/0.5 wm, Ves 
5 V, and Vr = 1 V. For small Vps, calculate Rpg. 


*B.17 For an NMOS transistor, assume that k/ = 40 wA/V?, W/L = 3.5 wm/0.35 wm, Ves 
3.3 V, and V7 = 0.66 V. For small Vps, calculate Rpg. 


B.18 For a PMOS transistor, assume that k, = 10 LA/V?, W/L =5.0 um/0.5 um, Vos 
—5 V, and V7 = —1 V. For Vps = —4.8 V, calculate Rps. 


B.19 For a PMOS transistor, assume that k, = 16 pA/V?, W/L = 3.5 wm/0.35 wm, Ves 
—3.3 V, and V7 = —0.66 V. For Vps = —3.2 V, calculate Rps. 


B.20 


*B.21 


B.22 


B.23 


B.24 
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In Example B.17 we showed how to calculate voltage levels in a pseudo-NMOS inverter. 
Figure PB.8 depicts a pseudo-PMOS inverter. In this technology, a weak NMOS transistor 
is used to implement a pull-down resistor. 

When V, = 0, Vy has a high value. The PMOS transistor is operating in the triode 
region, while the NMOS transistor limits the current flow, because it is operating in the 
saturation region. The current through the PMOS and NMOS transistors has to be the same 
and is given by Equations B.! and B.2. Find an expression for the high-output voltage, 
Vr = Vou, in terms of Vpp, Vr, kp», and k,, where k, and k, are gain factors as defined in 
Example B.17. 


Vpp 


Vpp Vy 


Figure PB.8 — The pseudo-PMOS inverter. 


For the circuit in Figure PB.8, assume the values k/ = 60 “A/V, ki, =0.4k,, Wn/Ln = 
0.5 um/0.5 wm, W,/L, = 4.0 um/0.5 wm, Vpp = 5 V and Vr = | V. When V, = 0, cal- 
culate the following: 

(a) The static current, [sia 

(b) The on-resistance of the PMOS transistor 

(c) Vo 

(d) The static power dissipated in the inverter 

(e) The on-resistance of the NMOS transistor 

(f) Assume that the inverter is used to drive a capacitive load of 70 fF. Using Equation B.4, 
calculate the low-to-high and high-to-low propagation delays. 


Repeat Problem B.21 assuming that the size of the NMOS transistor is changed to W,,/L, = 
4.0 uwm/0.5 um. 


Example B.17 (see Figure B.77) shows that in the pseudo-NMOS technology the pull-up 
device is implemented using a PMOS transistor. Repeat this problem for a NAND gate 
built with pseudo-NMOS technology. Assume that both of the NMOS transistors in the 
gate have the same parameters, as given in Example B.18. 


Repeat Problem B.23 for a pseudo-NMOS NOR gate. 
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B.29 


B.30 


B.31 


*B.32 
B.33 
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(a) For Vig = 4 V, Vow = 4.5 V, Vi, = 1 V, Vor = 0.3 V, and Vpp = 5 V, calculate the 
noise margins NMy and NM,. 

(b) Consider an eight-input NAND gate built using NMOS technology. If the voltage drop 
across each transistor is 0.1 V, what is Vo,’? What is the corresponding NM,, using the other 
parameters from part (a). 


Under steady-state conditions, for an n-input CMOS NAND gate, what are the voltage 
levels of Voz and Voy? Explain. 


For a CMOS inverter, assume that the load capacitance is C = 150 fF and Vpp = 5 V. 
The inverter is cycled through the low and high voltage levels at an average rate of f = 
75 MHz. 

(a) Calculate the dynamic power dissipated in the inverter. 

(b) For a chip that contains the equivalent of 250,000 inverters, calculate the total dynamic 
power dissipated if 20 percent of the gates change values at any given time. 


Repeat Problem B.27 for C = 120 fF, Vpp = 3.3 V, and f = 125 MHz. 


InaCMOS inverter, assume that k) = 20 wA/V?, ky, = 0.4 x kj), W,,/L, = 5.0 wm/0.5 um, 
W,/Lp = 5.0 wm/0.5 wm, and Vpp = 5 V. The inverter drives a load capacitance of 
150 fF. 

(a) Find the high-to-low propagation delay. 

(b) Find the low-to-high low propagation delay. 

(c) What should be the dimensions of the PMOS transistor such that the low-to-high and 
high-to-low propagation delays are equal? Ignore the effect of the PMOS transistor’s size 
on the load capacitance of the inverter. 


Repeat Problem B.29 for the parameters k’/, = 40 wA/V°, k, = 0.4 x k,, 
3.5 um/0.35 wm, and Vpp = 3.3 V. 


In a CMOS inverter, assume that W,,/L, = 2 and W,/L, = 4. For a CMOS NAND gate, 
calculate the required W/L ratios of the NMOS and PMOS transistors such that the available 
current in the gate to drive the output both low and high is equal to that in the inverter. 


Repeat Problem B.31 for a CMOS NOR gate. 


Wi /Ln = W,/L =— 


Repeat Problem B.31 for the CMOS complex gate in Figure B.16. The transistor sizes 
should be chosen such that in the worst case the available current is at least as large as in 
the inverter. 


Repeat Problem B.31 for the CMOS complex gate in Figure B.17. 


In Figure B.74 we showed a solution to the static power dissipation problem when NMOS 
pass transistors are used. Assume that the PMOS pull-up transistor is removed from this cir- 
cuit. Assume the parameters k’, = 60 pA/V?, ky, = 0.4 xk), W,/L, = 1.0 wm/0.25 wm, 
W,/Lp = 2.0 wm/0.25 um, Vpp = 2.5 V, and Vr = 0.6 V. For Vg = 1.6 V, calculate the 
following: 

(a) the static current, [jar 


(b) the voltage, V;, at the output of the inverter 


B.36 


B.37 


B.38 


B.39 


B.40 
B.41 
B.42 


B.43 


B.44 


*B.45 


B.46 


B.47 


B.48 
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(c) the static power dissipation in the inverter 
(d) If a chip contains 500,000 inverters used in this manner, find the total static power 
dissipation. 


Using the style of drawing in Figure B.70, draw a picture of a PLA programmed to implement 
fi, X2, x3) = )>m(1, 2,4, 7). The PLA should have the inputs x), ..., 3; the product 
terms P;,..., P4; and the outputs f| and fo. 


Using the style of drawing in Figure B.70, draw a picture of a PLA programmed to implement 
fix, x2, x3) = >> m(0, 3, 5, 6). The PLA should have the inputs x), ..., x3; the product 
terms P,,..., P4; and the outputs f, and fs. 


Show how the function f; from Problem B.36 can be realized in a PLA of the type shown in 
Figure B.71. Draw a picture of such a PLA programmed to implement f,. The PLA should 
have the inputs x), ...,.+3; the sum terms S,,..., $4; and the outputs f, and fo. 


Show how the function f; from Problem B.37 can be realized in a PLA of the type shown in 
Figure B.71. Draw a picture of such a PLA programmed to implement f;. The PLA should 
have the inputs x), ...,2+3; the sum terms S,,..., $4; and the outputs f, and fo. 


Repeat Problem B.38 using the style of PLA drawing shown in Figure B.67. 
Repeat Problem B.39 using the style of PLA drawing shown in Figure B.67. 


Given that f; is implemented as described in Problem B.36, list all of the other possible 
logic functions that can be realized using output f; in the PLA. 


Given that f; is implemented as described in Problem B.37, list all of the other possible 
logic functions that can be realized using output fo in the PLA. 


Consider the function f (x1, x2, 3) = XyX2 + x1x3 + %2X3. Show a circuit using 5 two-input 
lookup-tables (LUTs) to implement this expression. As shown in Figure B.39, give the 
truth table implemented in each LUT. You do not need to show the wires in the FPGA. 


Consider the function f (x1, x2, x3) = >> m(2, 3, 4, 6, 7). Show how it can be realized using 
two two-input LUTs. As shown in Figure B.39, give the truth table implemented in each 
LUT. You do not need to show the wires in the FPGA. 


Given the function f = x;x2x%4 + x2x3X4 + X1X2X3, a straightforward implementation in an 
FPGA with three-input LUTs requires four LUTs. Show how it can be done using only 3 
three-input LUTs. Label the output of each LUT with an expression representing the logic 
function that it implements. 


For f in Problem B.46, show a circuit of two-input LUTs that realizes the function. You 
are to use exactly seven two-input LUTs. Label the output of each LUT with an expression 
representing the logic function that it implements. 


Figure B.39 shows an FPGA programmed to implement a function. The figure shows one 
pin used for function f, and several pins that are unused. Without changing the programming 
of any switch that is turned on in the FPGA in the figure, list 10 other logic functions, in 
addition to f, that can be implemented on the unused pins. 
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B.49 Assume that a gate array contains the type of logic cell depicted in Figure PB.9. The inputs 
in,,..., in7 can be connected to either | or 0, or to any logic signal. 
(a) Show how the logic cell can be used to realize f = x1x2 + x3. 
(b) Show how the logic cell can be used to realize f = x13 + x2%3. 


inl in2 in3 


in4 in5 in6 in7 


Figure PB.9 —A gate-array logic cell. 


B.50 Assume that a gate array exists in which the logic cell used is a three-input NAND gate. The 
inputs to each NAND gate can be connected to either | or 0, or to any logic signal. Show 
how the following logic functions can be realized in the gate array. (Hint: Use DeMorgan’s 
theorem.) 

(a) f = x1x2 + x3 
(b) f = xyx2x4 + X2x3X4 +X] 


B.51 What logic gate is realized by the circuit in Figure PB.10? Does this circuit suffer from any 
major drawbacks? 


Figure PB.10 — Circuit for Problem B.51. 


REFERENCES 823 


*B.52 What logic gate is realized by the circuit in Figure PB.11? Does this circuit suffer from any 
major drawbacks? 


Figure PB.11 == Circuit for Problem B.52. 
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answers 


CHAPTER I 


1.1. (a) 10100 (b) 1100100 (c) 10000001 (d) 100000100 (e) 10100000000000 


1.4. (a) 10001 (b) 100001 (c) 1000011 (d) 10000010 (e) 101000000000 
(f) 1100100000000000 


1.6. (a)9 (b)28 (c)63 (d) 2730 
1.8. (a9 (b)10 (10 @ll 


CHAPTER 2 


2.7. (a) Yes (b) Yes (c) No 
2.12. f = x1%3 +2%2%3 + %2%3 
2.15. f= (1 +»x)G2 + %3) 
2.20. f = 2x3 4+-2%1%3 
2.23. f = (1 +x) +s) 
2.28. f = xx. +. x1%3 + x19%3 
2.32. f = (x1 +2 +%3)(01 + Xo +13) %1 + Xo + X3) HK + x2 +23) 
2.33. f = X03 + Xx. + x13 + x1 %2H3 
2.37. SOP form: f = X1x2 + 2x3 
POS form: f = (X; + X2)(x%2 + x3) 
2.38. SOP form: f = x)X2 + x1x3 + 2x3 
POS form: f = (x +.x3)(x) + X2)(%2 + x3) 


2.41. SOP form: Sf = X3X5 +.X3x4 + XoX4K5 + X1xX3XaXs + XyXoX4X5 
POS form: f = (%3+-x4-+%5)(%3 +2%4 +X5) (2 +3 +24) (Ky +43 + X4 +5) (K+. +44 +%5) 


2.45. f = xpxox3 + xp Xox4 + xy x3x4 + XVNZXG 


825 


826 ANSWERS 


2.47. The statement is false. As a counter example consider f (x, x2, x3) = >- m(0, 5, 7). Then, 


the minimum-cost SOP form f = x).x3-+%1X2%3 is unique. But, there are two minimum-cost 
POS forms: 


f = (1 +3) (1 +:.3)(4) +2) and 
f = (1 +53) (1 + x3) 2 + 3) 


2.48. Inacombined circuit: 


Sf = X2x3X4q + X1X3X4 + XyXNZX4 + X1X2X3 


& = XqN3Xq + XjAZX4 HP XXQAZX4 A Xj XIN4 
The first 3 product terms are shared, hence the total cost is 31. 


2.50. f = (er +x4 +45) + (Xr +2 +43) + (%1 +X +23) > %1 +X4 + x5) 
2.54. The circuit is 


2.56. The circuit is 


2.66. Representing both functions in the form of Karnaugh map, it is easy to show that f = g. 


| CHAPTER 3 


3.1. (a) 478 (b) 743 (c) 2025 (d) 41567 (e) 61680 
3.2. (a) 478 (b) —280 (c) —1 
3.3. (a) 478 (b) —281 (c) —2 


3.4. 


3.11. 


3.12. 


3.13. 
3.17. 


3.21. 
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The numbers are represented as follows: 


Decimal Sign and Magnitude 1’s Complement 2’s Complement 
73 000001001001 000001001001 000001001001 
1906 011101110010 011101110010 011101110010 
—95 100001011111 111110100000 111110100001 
—1630 111001011110 100110100001 100110100010 


Yes, it works. The NOT gate that produces c; is not needed in stages where i > 0. The 
drawback is “poor” propagation of c¢; = | through the topmost NMOS transistor. The 
positive aspect is fewer transistors needed to produce C;+. 


From Equation 3.4, each c; requires i AND gates and one OR gate. Therefore, to determine 
all c; signals we need )7)_,(i + 1) = (n? + 3n)/2 gates. In addition to this, we need 3n 
gates to generate all g, p, and s functions. Therefore, a total of (n? +9n) /2 gates are needed. 


84 gates 


The code in Figure P3.2 represents a multiplier. It multiplies the lower two bits of Input by 
the upper two bits of Input, producing the four-bit Output. 


A full-adder circuit can be used, such that two of the bits of the number are connected as 
inputs x and y, while the third bit is connected as the carry-in. Then, the carry-out and sum 
bits will indicate how many input bits are equal to 1. The circuit is 


£2 2] 


Cc 


828 


ANSWERS 


4.3. 


4.5. 


4.10. 
4.11. 


4.13. 


| CHAPTER 4 


Wi Wow3| f 


0 0 0 1 
0 0 1 0 
0 1 0 1 
01 1 1 
1 0 0 0 
1 0 1 0 
1 1 0 1 
1 11 0 
Wo “ 
W3 
The derived circuit is 
Wi 
W 
W3 
f 


F(W1, Wo, .++5 Wa) = [Wi +f, wa, ...,Wa))- [Wi +f, wa, ..., Wn)] 
Expansion of f in terms of w2 gives 
f = w2(w1 + W3) + wo(w1W3) 
= W2 ® (Ww) + W3) 
= w2 © wiW3 
The cost of this circuit is 2 gates + 4 inputs = 6. 
a= w3+wewo + Ww, + W2Wo 
b= Ww Wo + wiWo + W2 


C=W2+W,+Wo 
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4.18. The code in Figure P4.2 is a 2-to-4 decoder with an enable input. It is not a good style 
for defining this decoder. The code is not easy to read. Moreover, the Verilog compiler 
often turns if statements into multiplexers, in which case the resulting decoder may have 
multiplexers controlled by the En signal on the output side. 


CHAPTER 5 


5.3. The circuit is 


S 
Q 

Clk 
Q 

R 


5.5. A possible circuit is 


Clock 


Q(t + 1) 


S 
0 
0 
1 


oro|w 


= 
= 
j=) 
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5.8. The circuit acts as a negative-edge-triggered JK flip-flop, in which J = A, K = B, 
Clock = C,Q=D,andQ= E. 


5.15. The following circuit implements the desired counter 


Up/down 


5.17. The counting sequence is 000, 001, 010, 111. 


5.21. The longest delay in the circuit is the from the output of FFo to the input of FF;. This delay 
totals 5 ns. Thus the minimum period for which the circuit will operate reliably is 


Tmin =5+3+1=9ns 


The maximum frequency is 


Fmax — 1/Tin = 111 MHz 


5.26. A suitable circuit is 
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CHAPTER 6 
6.1. The expressions for the inputs of the flip-flops are 
Dy = Yz = Wy2 + YY 
Di =Y= wy Oy2 
The output equation is z = y,y2. 
6.2. The expressions for the inputs of the flip-flops are 
In =Y, 
K> =w 
J ,= Wwy2 + WY 
Ki=J 
The output equation is z = y,y2. 
6.5. Minimal state table is 
Present Next state Output 
state a 
w=0 w=1 
A A B 0 
B E Cc 0 
Cc D Cc 0 
D A F 1 
E A F 0 
F E Cc 1 
6.6. Minimal state table is 
Present Next state Output z 
state 


w=0 w=1 w=0 w=1 


GaAW>S> 


A B 0 0 
D Cc 0 0 
D Cc 1 0 
A B 0 1 
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6.12. Minimal state table is 


Present Next state Output 
state Wit ral Pp 
A B Cc 0 
B D E 0 
C E D 0 
D A FE 0 
E F A 0 
F B Cc 1 


6.15. The next-state expressions are 
D4 = Y4 = wy3+wy, 
D3 = Y3 = w(yi + ya) 
Dz = Y2 = wy2+wy4 
D, =Y; = w2+y1) 
The output is given by z = yy. 
6.17. Minimal state table is 


Poent Next state Output z 
state w=0 w=1 w=0 w=1 
A A Cc 0 0 
Cc F C 0 1 
F Cc A 0 1 


6.21. The desired circuit is 
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6.22. The desired circuit is 


6.29. The state table is 


Present Next state Output 
state a0 1 es 
A A Cc 0 
B A D 1 
C A D 0 
D A B 0 


The circuit produces z = 1 whenever the input sequence on w comprises a 0 followed by 
an even number of Is. 


CHAPTER 8 


8.1. f= (G3 ta) t (eg tg) t x4) where g = (1 T 2 t2)) T (Or x1) T22) 


8.2. f= (O31) +8) 1 (gs) ) Gad x4), where _ 
g= (1 da) ba) d Grd G2 |). Then, f =f 1 f. 


8.5. f = X1 (x2 +3) (x4 + x5) + x1 (Ko + x3)(X4 + 25) 
8.8. f=g-h4+z-h, where g = xx) andh = x3 4+ x4 
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8.12. The BDD is 


8.18. f = XyX2%4 + Xp x2%3 + xp HX3 + xW3X4 


| CHAPTER 9 


9.1. The flow table is 


Present Next state 
state | wow, =00 01 10s 22% 
A D C D C 11 
B DD 10 
Cc D ©) D (c) 01 
: (v) oe ete 2 00 


The behavior is the same as described in the flow table in Figure 9.21a, if the state inter- 
changes A <> D and B < C are made. 


ANSWERS 835 


9.8. Using the merger diagram in Figure 9.40a, the FSM in Figure 9.39 becomes 


9.10. 


9.12. 


Present Next state Output 
sean ener re eT i 
A Ome Wa 0 
B c D 0 
c Oy eG 1 
D Se ee) 0 
E een 1 
G nee) ean 1 


The minimum-cost hazard-free implementation is 


f = X1X3X4 + Xp x2N4 + HY AZXG 


The minimum-cost hazard-free POS implementation is 


SF = (Xp x2 +44) (K +X2 +53) (41 +43 +X4) (X2+23-+X4) 


IfA = B = D = E = 1 and C changes from 0 to 1, then f changes 0 > 1 — O and g 
changes 0 > 1 — 0 — 1. Therefore, there is a static hazard on f and a dynamic hazard 


on g. 


The excitation table 


18 


Next state Output 
Present 
state we = 00 01 10 11 00 01 10 11 
bg 
Y z 
>| O©®: Of so 
! >O@A@lr 1 o 1 


The next-state expression is Y = we+cy+wy. Note that the term wy is included to prevent 
a static hazard. The output expression is z = cy. 
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11.12. 


CHAPTER II 


A minimal test set must include the tests w;)w2w3 = O11, 101, and 111, as well as one of 
000, 010, or 100. 


The two functions differ only in the vertex x1x2x3x4 = 0111. Therefore, the circuits can be 
distinguished by applying this input valuation. 


The tests are wyw2w3w4 = 1111, 1110, 0111, and 1111. 


Cannot detect if the input wire w, is stuck-at-1. The reason is that this circuit is highly 
redundant. It realizes the function f = w3(w, + W2), which can be implemented with a 
simpler circuit. 


Test set = {0000, 0111, 1111, 1000}. It would work with XORs implemented as shown in 
Figure 4.26c. For n bits, the same patterns can be used; thus 


Test set = {00...00, O11...1, 11...1, 100... 0}. 


In the decoder circuit in Figure 6.16c the four AND gates are enabled only if the En signal is 
active. The required test set has to include all four valuations of w; and wz when En = 1. It 
is also necessary to test if the En wire is stuck at 1, which can be accomplished with the test 
Ww 1w2En = 000. Therefore, a complete test set comprises wjw2En = 000, 001, 011, 101, 
and 111. 
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B.4. 


Using the circuit 


The number of transistors needed is 16. 


B.8. 


B.12. 


ANSWERS 
The complete circuit is 
Vop 
,—4 
Ve, 
i, 
Vy 
Ve, 
.— 
Ve, 
The required circuit is 
Vop Yop 
Vi 
Ve 
v 
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B.14. 
B.17. 
B.25. 
B.28. 
B.32. 


B.45. 


B.55. 


ANSWERS 


(a) Ip = 800 pA (b) Ip = 78 pA 
Rps = 947 Q 
(a) NMy =0.5V NM,=0.7V (b) Vo, =0.8V NM, =0.2V 
(a) PNOT gate = 163 pW (b) Protal = 8.2 W 
The two NMOS transistors in a CMOS NOR gate are connected in parallel. The worst 
case current to drive the output low happens when only one of these transistors is turned 
“ON”. Thus each transistor has to have the same dimensions as the NMOS transistor in the 
inverter, namely W,,/L, = 2. 

The two PMOS transistors are connected in series. If each of these transistors had the 
ratio W,,/Lpy, then the two transistors could be thought of as one transistor with a W,/2L, 


ratio. Thus each PMOS transistor must be made twice as wide as that in the inverter, namely 
WwW, / L, = 8. 


f = x2 +%,x3. The corresponding circuit is 


xy 


X3 os 
Xy + X4X3 


The circuit in Figure PB.11 is a two-input XOR gate. This circuit has two drawbacks: when 
both inputs are 0 the PMOS transistor must drive f to 0, resulting in ff = Vr volts. Also, 
when x; = | and x2 = 0, the NMOS transistor must drive the output high, resulting in 


Ff = Vpp — Vr. 
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ABC system, 536 
Absorption property, 35 
Accumulator, 272 
Active clock edge, 332 
Active-low signal, 201 
Adder: 
BCD, 176 
carry lookahead, 146 
carry-save array, 183 
full-adder, 127 
half-adder, 32, 125 
in Verilog code, 152-163, 641 
propagation delay, 129, 151 
ripple-carry, 129 
serial, 363 
Adder/subtractor, 139 
Addition, 32, 125 
BCD, 174 
carry, 125 
generate function, 146 
overflow, 143, 648 
propagate function, 146 
sum, 125 
Verilog, 159-163 
Address, 796 
Aliasing problem in testing, 673 
Algorithm, 441 


Algorithmic state machine (ASM): 


ASM charts, 401 
ASM block, 404 
conditional output box, 402 
decision box, 402 
implied timing, 404 
state box, 401 
Alphanumeric characters, 14 
Altera Corporation, 8 
always block (Verilog), 74, 698 
Analog information, 16 
Analysis, 29, 504 
AND function, 23 
AND gate (see Gates) 
And-or-invert cells, 807 
AND plane (also AND array), 754 
Anode terminal, 304 
Arbiter circuit, 393, 571 
Arithmetic: 


floating-point (see Floating point) 


operators (Verilog), 226 
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E xX 


overflow, 143 


(See also Addition; Division; Multiplication; Subtraction) 


Arithmetic and logic unit (ALU), 218 
Arithmetic assignment (Verilog), 159 
Array multiplier (see Multiplication) 
ASCII code, 14 

ASIC, 5, 769 

ASM chart (see Algorithmic state machine) 
Aspect ratio, 794 

assign (Verilog), 70 

Associative property, 35 

Asynchronous clear (reset), 263 
Asynchronous clear (in Verilog), 294, 718 
Asynchronous counter, 269 

Asynchronous inputs, 482 

Asynchronous sequential circuit (see Sequential circuits) 
Axioms of Boolean algebra, 33 


Barrel shifter, 238, 687 
BCD (see Binary-coded decimal), 96, 176 
BCD-to-7-segment converter, 96 
BDD (see binary decision diagram) 
BDD packages, 520 
begin (Verilog), 156, 699 
begin-end block (Verilog), 156, 699 
Behavioral Verilog code, 70, 699 
BGA package, 766 
BILBO (Built-in Logic Block Observer), 673 
Binary-coded decimal (BCD), 174 
addition, 174 
counter, 280 
digits, 174 
Binary decision diagram (BDD), 514 
Binary numbers, 12 
in Verilog code, 166, 687 
Binary variable, 23 
BIST (Built-in Self Test), 669 
Bit, 12 
Bit-counting circuit, 441 
Bit-select (Verilog), 689 
Bitwise operators (Verilog), 223 
Blocking assignment (Verilog), 288, 700 
Body effect, 785 
Boolean algebra, 33 
Boundary scan, 676 
Branching heuristic, 91, 527 
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Buffer (see Driver) 1’s, 133 
Bus, 422 radix, 139 
Bypass capacitor, 677 2’s, 133 
Byte, 13 Complementary metal-oxide semiconductor (see CMOS 
technology) 
Complex gate (CMOS), 715 
Cc Complex programmable logic device (CPLD), 761 
CAD (see Computer aided design) Compressor circuit, 671 


Computer-aided design (CAD), 65 
chip configuration, 68 
design entry, 65 
functional simulation, 67 


Canonical expressions: 
canonical product-of-sums, 52 
canonical sum-of-products, 49 
Capacitance, 779 


Carry: layout synthesis (physical design), 67 
carry-in, 125 logic synthesis (optimization), 67 
carry-out, 126 technology mapping, 537, 640 

Carry chain, 645 timing simulation, 67 

Carry lookahead adder, 146 Concatenation (Verilog), 161, 227 

Carry-save adder, 183 Conditional operator (Verilog), 210 

case statement, 215, 702 Consensus property, 40 

casex statement, 220, 703 Consistency check, 659 

casez statement, 220, 703 Constant (in Verilog), 166 

Cathode terminal, 304 Continuous assignment (Verilog), 70, 696 

Channel (in MOSFET), 774 Control circuit, 422 

Character codes, 14 Cost, 50, 89 

Characteristic impedance, 678 Counter: 

Characteristic table, 249 asynchronous, 269 

Chip configuration, 68 BCD, 280 

Circuit size, 764 down, 271 

Clear input, 260 enable and clear capability, 274 

Clock, 251 Johnson, 283 

Clock divider, 304 modulo-n, 278 

Clock enable, 301 parallel load of, 276 

Clock network, 480 reset of, 278 

Clock skew, 312 ring, 281 

Clock synchronization, 478 ripple, 271 

Clock-to-Q delay (t.g), 263 synchronous, 272 

Clock-to-output time (fo), 285 up, 270 

CMOS technology, 655 up/down, 272 

Code: Verilog code, 298-300, 721 
BCD, 96 Cover, 88 
converter, 208 minimal, 88 
error-detecting, 241 table, 522 
Gray, 82, 235 Critical path, 145, 310 
One-hot, 201, 281 Crossbar, 193 

Cofactor, 198, 516 Cross-coupled gates, 249 

Coincidence operation, 128 Crosstalk, 677 

Column dominance, 524 Cubical representation, 510 

Combinational circuits, 190-242 Current flow: 

Combining property, 35, 79 dynamic, 731 

Comment (Verilog), 72 gate, 726 

Commutative property, 35 leakage, 777 

Comparator, 208, 227 short circuit, 784 

Compatible states, 581 static, 731 

Complement: Custom chips, 5, 769 
diminished radix, 142 Cut-off region, 726 


of a logic variable, 26 Cut set, 564 


D flip-flop (see Flip-flop) 
D-algorithm, 661 
Datapath, 422 
DC-set, 94 
Debouncing, 483 
Decimal numbers, 12 
Decision tree, 514 
Decoder, 201, 433 

tree, 203 
Decomposition (see Functional decomposition) 
default case alternative (Verilog), 215 
defparam (Verilog), 163 
Delay (see Propagation delay) 
Delay (in Verilog), 694, 698 
DeMorgan’s theorem, 35, 42 
Demultiplexer, 203 
Design, 6 
Design entry, 65 
Design for testability, 665 
Design process, 6 
Digital hardware, 2 
Digital information, 16 
Digital system, 422 
Diminished radix complement, 142 
Disjoint decomposition, 500 
Distributive property, 35 
Division, 455 
Don’t-care condition, 94 

in Verilog code, 704 
Double precision (see Floating point) 
Down-counter, (see Counter) 
Drain (in MOSFET transistor), 735 
Driver, 751,791 

inverting, 793 

tri-state, 792 
Duality, 34 
Duty cycle, 489 
Dynamic hazard, 609, 613 


E 
Edge (in signals), 257 
Edge-triggered, 256, 267 


Electrically-erasable programmable read-only memory (EEPROM), 


799 

Enable input, 201, 274, 301 
Encoder, 205 

binary, 205 

priority, 205 
end (Verilog), 699 
Energy (capacitor), 732 
Equality operators (Verilog), 226 
Equivalence: 

of logic expressions, 31 

of states, 374 


INDEX 


Equivalent-gates metric, 764 

Erasable programmable read-only memory (EPROM), 801 
Errors in Verilog code, 706 

Escaped identifier, 687 

Espresso, 536 

Essential prime implicant, 89 

Event expression (Verilog), 24, 703 
Excess-127 format, 173 

Excess-1023 format, 173 

Excitation table, 388, 556 
Exclusive-NOR (XNOR) gate (see Gates) 
Exclusive-OR (XOR) gate (see Gates) 
Expansion theorem (Shannon’s), 198, 515 


F 


Factoring, 493 
Fall time, 781 
Fan-in, 151, 493, 788 
Fan-out, 790 
Fault: 
detection, 659 
model, 654 
propagation, 659 
stuck-at, 654 
Feedback, 248 
Field-programmable gate array (FPGA), 5, 764, 804 
Finite state machine (FSM), 333 
incompletely specified, 381 
summary of design procedure, 340 
Fixed-point numbers, 170 
Flip-flop, 258 
Flip-flops: 
clear and preset inputs, 260 
clock-to-Q delay (t.g), 263 
configurable (in PLDs), 285 
D, 256-262 
edge-triggered, 256, 258, 267 
JK, 264 
hold time, 263 
master-slave, 256, 558 
setup time, 263 
SR, 322 
T, 263 
Timing parameters, 263 
Verilog code for, 288, 717 
Floating gate, 799 
Floating point, 172 
double precision, 173 
exponent, 172 
format, 172 
IEEE standard, 172 
mantissa, 172 
normalized, 172 
representation, 172 
single precision, 172 
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Flow table, 556 

primitive, 578 

state reduction, 577 
Finax, 310 
for loop statement, 156, 221, 704 
Fowler-Nordheim tunneling, 799 
FPLA (see PLA) 
FSM (see Finite state machine) 
Full-adder, 127 
function (Verilog), 231, 713 
Functional behavior, 29 
Functional decomposition, 493 
Functional equivalence, 31 
Functional simulation, 67 
Fundamental mode, 552, 608 
Fuse map, 720 


G 


Gate (in MOSFET transistor), 735 
Gate array, 770 
Gate delay (see Propagation delay) 
Gate optimization, 638 
Gated D latch, 253, 285, 717, 788 
Gated SR latch, 251 
Gates: 

AND, 27, 738 

NAND, 54, 738, 741 

NOR, 54, 738, 743 

NOT, 27 

OR, 27, 738 

XNOR, 128 

XOR, 32, 128 
generate (Verilog), 157, 228, 712 
genvar (Verilog), 157, 228, 712 
Glitch, 561 
Global signals, 480 
Gray code, 82, 235 


H 

H tree, 480 

Half-adder, 32, 125 
Hamming distance, 592 
Handshake signaling, 571 


Hardware description language (HDL), 65 


Hazard, 556, 608 

dynamic, 609, 613 

static, 609 
Heuristic approach, 90 
Hexadecimal numbers, 123 
Hierarchical design, 65 
Hierarchical Verilog code, 163 


High-level behavioral Verilog code, 309 


High-impedance output, 220, 422 
Hold time, 256, 263 


Huntington’s postulates, 35 
Hypercube, 513 


Identifier (Verilog), 687 
IEEE, 65 
IEEE standards (see Standards) 
if-else statement, 212, 700 
Implicant, 87 
Implied memory (Verilog), 286, 702 
Incompletely specified FSM, 381 
Incompletely specified functions, 94 
initial block (Verilog), 699 
Input variable, 23 
Instantiation (of Verilog gates), 71, 110, 694 
Instantiation (of Verilog modules), 76, 78, 164, 165, 709 
Instrumentation, 679 
In-system programming (ISP), 760 
Integer: 
in Verilog, 159, 689 
signed, 132 
unsigned, 132 
Integrated circuit (IC), 2 
International Technology Roadmap for Semiconductors (ITRS), 3 
Intersection, 38 
Inversion, 26 
Inverter (NOT gate), 27, 738 


J 

JK flip-flop, 264 
Johnson counter, 283 
JTAG port, 764 


K 


Karnaugh map, 79 
k-cube, 513 
k-successor, 374 


L 


Large scale integration (LSI), 753 
Latch: 

basic SR, 249 

gated D, 253, 285 

gated SR, 251 

set-dominant SR, 322 

Verilog code, 287, 717 
Leakage current, 777 
Least-significant bit, 13 
LED (Light emitting diode), 304 
Level-sensitive element, 254, 258 
Level-sensitive scan design, 669 


Libraries, 65 
Linear feedback shift register (LFSR), 326, 670 
Linear region (see Triode region) 
Literal, 87 
Logic analyzer, 679 
Logic block, 545 
Logic circuit, 22, 29 
Logic expression, 23 
Logic functions, 23 
AND, 27 
NAND, 54 
NOR, 54 
NOT, 27 
OR, 27 
synthesis, 29 
XNOR, 128 
XOR, 128 
Logic gates, 27 
drive capability, 791 
dynamic operation, 729 
fall time, 781 
fan-in, 493, 788 
fan-out, 790 
noise margin, 778 
power dissipation, 731 
propagation delay, 780 
rise time, 781 
transfer characteristic, 777 
Verilog gates, 695 
Logic network, 28 
Logic values, 22 
Logical operators (Verilog), 225 
Logical product (AND), 43 
Logical sum (OR), 43 
Lookup table (LUT), 766 
Loop statement (see for loop) 


M 


Macrocell, 758 
Magnitude, 132 
Majority function, 115 
Master (see Flip-flop, master-slave) 
Master-slave (see Flip-flop) 
Maxterm, 52 
Mealy FSM, 332, 349 

Verilog code, 363, 724 
Mealy output, 402 
Mean operation, 466 
Medium-scale integration (MSI), 753 
Memory, 795 


implied memory (Verilog), 286, 436, 702, 690 


Merger diagram, 581 
Merging, 578 
procedure, 581 
Metal-oxide semiconductor, 734 
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Metastability, 263, 483 
Minimization: 
of logic functions, 89 
of states, 372, 577 
Minterm, 48 
Mixed logic, 748 
module (Verilog), 70, 692 
Moore FSM, 332 
Verilog code, 355, 723 
Moore output, 401 
Moore’s law, 3 
MOSFET transistor, 734 
on-resistance, 776 
Most-significant bit, 13 
Motherboard, 8 
Multibit assignment (in Verilog), 160 
Multilevel circuits, 492, 502 
Multiple-output circuits, 96 
Multiplexer, 62, 190 
Multiplexer (Verilog code), 211 
Multiplication, 167 
array implementation, 167 
carry-save array, 183 
partial product, 167 
sequential implementation, 446 
signed-operand, 169 
Mutual exclusion element (ME), 577 


Named port connection, 165, 709 
Names (Verilog), 687 
NAND circuits, 502, 738, 741 
NAND gate (see Gates) 
n-cube, 513 
Negative edge, 257 
Negative logic, 733, 747 
Negative numbers, 133 
negedge (Verilog), 287, 717 
Net (in Verilog), 158, 686, 688 
Network, 27 
Next state, 336, 552 
Nibble, 13 
9’s complement, 140 
NMOS technology, 736 
NMOS transistor, 734 
Noise, 778 

margin, 778 

power supply, 677 
Non-blocking assignment (Verilog), 289, 700 
Non-disjoint decomposition, 500 
Nonvolatile programming, 764 
NOR gate (see Gates) 
NOR circuits, 502, 738, 741 
NOR plane, 798 
NOT function, 27 
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NOT gate (see Gates) 

Number conversion, 13 

Number representation: 
binary coded decimal, 174 
fixed-point, 170 
floating-point, 172 
hexadecimal, 123 
octal, 123 
1’s-complement, 133 
positional notation, 122 
sign and magnitude, 133 
signed integer, 132 
10’s-complement, 140 
2’s-complement, 133 
unsigned integer, 122 
in Verilog, 166, 687 


fo) 


Octal numbers, 123 

Odd function, 128 

One-hot encoding, 203, 281, 347, 607 
1’s-complement representation, 133 
1149.1 Standard, 676 
On-resistance, 776 

ON-set, 534 

Operations (see Logic functions) 
Operators (Verilog), 223, 690 
Optimization (see Minimization) 
OR function, 23 

OR gate (see Gates) 

OR plane, 754 

Ordered port connection, 165, 709 
Ordering of statements, 292 
Oscilloscope, 679 

Output delay time (f,), 481 
Overflow (see Arithmetic overflow) 


P 
Packages (physical): 
ball grid array (BGA), 766 
dual inline (DIP), 749 
pin grid array (PGA), 765 
plastic-leaded chip carrier (PLCC), 763 
quad flat pack (QFP), 762 
small-outline integrated circuit (SOIC), 751 
PAL, 757 
Parallel-to-serial converter, 268, 415 
Parallel transfer, 267 
parameter (Verilog), 156, 688 
Parasitic capacitance, 780 
Parity, 240, 413, 565 
Partial product, 167 
Part-select (Verilog), 689 
Pass transistor, 804 


Path sensitizing, 657 
Physical design, 644 
Pins, 646 
PLA, 754, 799 
Placement, 646, 676 
PLD, 5, 753 
PMOS transistor, 734 
Polysilicon, 772 
Port (Verilog), 70, 692 
Portability, 66 
posedge (Verilog), 287, 717 
Positional number representation, 12, 122 
Positive logic, 733 
Power dissipation, 782 
Precedence of operations, 43 
Present state, 336, 552 
Preset input, 260 
Price/performance ratio, 145 
Prime implicant, 88 
Primitive flow table, 578 
Printed circuit board (PCB), 3, 676 
Priority: 

encoder, 205 

in Verilog code, 221, 223 
Procedural statements (Verilog), 74, 698 
Process transconductance parameter, 774 
Processor, 432 
Product-of-sums form (POS), 52 
Programmable array logic (see PAL) 
Programmable logic array (see PLA) 
Programmable logic device (see PLD) 
Programmable ROM (PROM), 768 
Propagation delay, 67, 129, 151 
Properties of Boolean algebra, 34 
Pseudo-NMOS technology, 777 
Pseudorandom tests, 670 


Pseudorandom binary sequence generator (PRBSG), 670 


Pull-down network, 473 
Pull-up network, 473 
Pull-up resistor, 738 
Pulse mode, 552 


Q 
Quine-McCluskey method, 521 


Race condition, 567 

Radix complement, 139 

RAM (see Static random access memory) 
Random testing, 662 

Read-only memory (ROM), 804 
Reaction timer, 302 

Reduction operators (Verilog), 225 
Reflections, 677 


reg (Verilog), 156, 689 
Register, 267 

Verilog code, 295, 718 
Register-Transfer Level (RTL) code, 309, 726 
Relational operators (Verilog), 226 
Reliability, 679 
Replication operator (Verilog), 167, 227 
Reset input, 335 
Reset state, 335 
Resistance (transistor channel), 776 
Ring counter, 281 
Ring Oscillator, 324 
Ripple-carry adder, 129 
Ripple counter, (see Counter) 
Rise time, 781 
ROM (see Read only memory) 
Routing, 647, 677 

channel, 769 
Row dominance, 523 


s 


Saturation region, 774 
Scalar (Verilog), 159, 688 
Scan path, 666 
Schematic, 27 
Schematic capture, 65, 151 
Sea-of-gates technology, 771 
Semiconductor, 772 
Semi-custom chips, 5 
Sensitivity list (Verilog), 74, 699 
Sequence detector, 341 
Sequential circuits, 248, 332 
analysis, 397, 556 
asynchronous, 551-630 
definition of, 332 
excitation table, 388 
finite state machine, 333 
flow table, 556 
formal model, 405 
merger diagram, 581 
reset, 335 
state, 332 
state assignment, 336, 344, 592 
state assignment in Verilog, 361 
state diagram, 334 
state reduction, 372, 577 
state table, 335 
synchronous, 332-415 
testing, 665 
transition diagram, 595 
Sequential statement (Verilog), 698 
Serial adder, 363 
Serial parity generator, 565 
Serial transfer, 267 
Series-to-parallel converter, 268 
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Set input, 260 
Setup time, 256, 263 
7400-series chips, 749 
7-segment display, 63 
BCD-to-7-segment converter, 96 
hex-to-7-segment converter, 208 
Shannon’s expansion, 198, 515 
Sharp-operation (#-operation), 529 
Shift operators (Verilog), 226 
Shift register, 267 
Verilog code, 295, 720 
Sign bit, 132 
Signal value (Verilog), 687 
Sign-and-magnitude representation, 133 
Signature, 671 
Signature analysis, 675 
Sign extension, 167, 697 
Signed numbers, 132 
Silicon wafer, 2 
Simplification (see Minimization) 
Simulation, 67 
Single-precision (see Floating point) 
Skew (see Clock skew) 
Slack (timing), 649 
Slave (see Flip-flop, master-slave) 
Small-scale integration (SSD), 753 
Socket, 760 
Sort operation, 470 
Source (in MOSFET transistor), 735 
SR latch (see Latch) 
Stable state, 552 
Standard cells, 769 
Standard chips, 4 
Standards: 
IEEE floating-point, 172 
1149.1 (Testing), 676 
Verilog 1364-1995, 68 
Verilog 1364-2001, 68 
VHDL, 65 
Star-operation (s-operation), 529 
Starting state, 334 
State, 248, 332 
assignment, 336, 344, 592, 607 
assignment in Verilog, 361 
compatibility, 581 
diagram, 334 
equivalence, 374 
reduction, 372, 577 
definition of, 332 
table, 335 
variables, 336 
State-adjacency diagram, 595 
State-assigned table, 337 
State machine (see Finite state machine) 
Static hazard, 609 
Static random access memory (SRAM), 794 
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Structural Verilog code, 70 

Stuck-at fault, 654 

Substrate, 734 

Subtraction, 137 

Sum, 125 

Sum-of-products form (SOP), 49 
Synchronize (to clock), 482 
Synchronous clear, 263 

Synchronous clear (Verilog), 294, 718 
Synchronous counter, 272 


Synchronous sequential circuits (see Sequential circuits) 


Synthesis, 29, 45 
multilevel, 492 


T 
T flip-flop, 263 
tg, 263, 310 
teo, 285 
tn, 256, 263 
tsu, 256, 263 
task (Verilog), 229, 715 
Technology mapping, 537 
10’s complement, 140 
Terminal node (in BDD), 514 
Terminations, 678 
Test, 654 
Test generation, 656-665 
Test set, 655 
Test vectors (see Test generation) 
Testing, 653-680 
Theorems of Boolean algebra, 34 
Three-state output (see Tri-state) 
Threshold voltage, 733, 774 
Timer, 488 
Timing analysis, 310, 320, 648 
Timing Analyzer, 648 
Timing constraints, 649 
Timing diagram, 29, 339 
Timing simulation, 67 
Transfer characteristic, 777 
Transistor: 

EEPROM, 799 

EPROM, 801 

MOSFET, 734 

size, 773 
Transition diagram, 595 
Transition table (see Excitation table) 
Transmission gate, 786 
Transmission line effects, 678 
Triode region, 774 
tri (Verilog), 427 
Tri-state: 

driver, 792 

Verilog code, 427 


Truth table, 26 
2’s-complement representation, 133 


U 


Union, 39 
Unsigned numbers, 122 
Unstable state, 561 
Up-counter (see Counter) 
Up/down-counter (see Counter) 
User-programmable device (see PLD) 
Universal shift register, 323 
Valuation, 26 
Variable, 23 
in Verilog, 75, 159, 689 
Vector (Verilog), 155, 688 
Venn diagram, 37 
Verilog HDL, 685-729 
assign keyword, 70 
asynchronous clear, 294, 718 
always block, 74, 698 
arithmetic assignment, 159 
arithmetic operators, 226 
blocking assignment, 288, 700 
begin-end block, 156, 699 
bit-select, 689 
bitwise operators, 223 
case statement, 215, 702 
casex statement, 220, 703 
casez statement, 220, 703 
comment, 72, 686 
concatenation, 167, 227 
concurrent statement, 696 
conditional operator, 210 
constant, 166 
continuous assignment, 70, 696 
default case alternative, 215 
defparam, 163 
delay, 694, 698 
don’t care, 220 
end, 699 
equality operators, 226 
escaped identifier, 687 
for loop, 156, 221, 704 
function, 231, 713 
gate level primitives, 694 
generate, 157, 228, 712 
genvar, 157, 228, 712 
hierarchical code, 163 
identifier, 687 
if-else statement, 212, 700 
implied memory, 286, 702 
initial block, 699 
integer type, 159, 689 
instantiation of gates, 71, 110, 694 


instantiation of modules, 76, 78, 164, 165, 709 


uploaded by [stormrg] 


logic gates, 695 

logical operators, 225 
memories, 690 

module, 70, 692 

multibit assignment, 160 

named port connection, 165, 709 
names, 687 

negedge, 287, 717 

net, 158, 686, 688 

non-blocking assignment, 289, 700 
number representation, 166, 687 
operators, 223, 690 

ordered port connection, 165, 709 
parameter, 156, 688 

parameter override, 163, 710 
part-select, 689 

posedge, 287, 717 

precedence, 228 

procedural statements, 74, 698 
port, 70, 692 

radix, 687 

range, 686, 689 

reduction operators, 225 

reg, 156, 689 

relational operators, 226 

repeat loop, 706 

replication, 167, 227 

scalar, 159 

sensitivity list, 74, 699 

# operator, 163 

shift operators, 226 
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sign extension, 167, 697 
signal, 686 
signal value, 687 
synchronous clear, 294, 718 
task, 229, 715 
tri type, 689 
truth tables, 704 
variable, 75, 159, 689 
vector, 155, 689 
while loop, 706 
white space, 686 
wire type, 78, 158, 688 
Vertex, 510 
Very large-scale integration (VLSD, 753 
VHDL, 65 
Via, 769 
Volatile programming, 768 
Voltage levels 
substrate bias, 785 
high, low, 734, 776 
Voltage transfer characteristic, 777 


Ww 


while loop (Verilog), 706 
wire (Verilog), 78, 158, 688 


Xx 


XNOR (Exclusive-NOR) gate (see Gates) 
XOR (Exclusive-OR) gate (see Gates) 
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